Biotin biosynthesis in Bacillus subtilis.

The present invention is directed to DNA sequences of genes that encode a biotin biosynthetic enzyme of B.
subtilis or of a closely related species thereof, vectors comprising such DNA sequences, cells comprising such
DNA sequences and vectors, and a process for the production of biotin by such cells.






EP 0 635 572 A2 



The present invention relates to the production of biotin using a genetically engineered organism.
Biotin (vitamin B8 or vitamin H), a coenzyme for carboxylation and decarboxylation reactions, is an
essential metabolite for living cells. Exogenous biotin is required for most higher organisms; however, many
bacteria synthesize their own biotin.

The enzymatic steps involved in the biotin synthetic pathway from pimelyl-CoA (PmCoA) to biotin have
been elucidated in Escherichia coli and Bacillus sphaericus [Fig. 1; reviewed in Perkins and Pero,
Bacillus subtilis and other Gram-Positive Bacteria, ed. Sonenshein, Hoch, and Losick, Amer. Soc. of
Microbiology, pp. 325-329 (1993)]. The steps include the conversions of 1) pimelyl-CoA to 7-keto-8-amino-
pelargonic acid (7-KAP or KAPA) by 7-KAP synthetase (bioF); 2) 7-KAP to 7,8-diamino-pelargonic acid
(DAPA) by DAPA aminotransferase (bioA); 3) DAPA to dethiobiotin (DTB) by DTB synthetase (bioD); and 4)
DTB to biotin by biotin synthetase (bioB). Synthesis of PmCoA reportedly involves different enzymatic steps
in different microorganisms. The E. coli genes involved in steps preceding pimelyl-CoA synthesis include
bioC [Otsuka et al., J. Biol. Chem. 263, 19577-19585 (1988)] and bioH [O'Regan et al., Nucleic Acids Res.
17, 8004 (1989)]. In B. sphaericus, two different genes, bioX and bioW, are thought to be involved in
PmCoA synthesis. BioX is thought to be involved in pimelate biosynthesis [Gloeckler et al., Gene 87, 63-70
(1990)], and bioW has been shown to encode pimelyl-CoA synthetase, which converts pimelic acid (PmA) to
PmCoA [Ploux et al., Biochem. J. 287, 685-690 (1992)]. Neither B. sphaericus gene, bioW or bioX, has
significant sequence similarity with the E. coli bioC and bioH genes at either the nucleotide or protein level
[Gloeckler et al. (1990) supra].
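
The four conversions listed above can be written down as a small lookup structure. The following is only an illustrative sketch: the names `Step`, `PATHWAY`, and `genes_needed` are hypothetical and introduced here for convenience; only the substrates, products, enzymes, and gene names come from the text.

```python
from typing import NamedTuple


class Step(NamedTuple):
    substrate: str
    product: str
    enzyme: str
    gene: str


# The four steps from pimelyl-CoA (PmCoA) to biotin, in the order given in the text.
PATHWAY = [
    Step("PmCoA", "7-KAP", "7-KAP synthetase", "bioF"),
    Step("7-KAP", "DAPA", "DAPA aminotransferase", "bioA"),
    Step("DAPA", "DTB", "DTB synthetase", "bioD"),
    Step("DTB", "biotin", "biotin synthetase", "bioB"),
]


def genes_needed(start: str, end: str = "biotin") -> list[str]:
    """Return the genes whose products convert `start` into `end`, in pathway order."""
    genes = []
    current = start
    for step in PATHWAY:
        if current == end:
            break
        if step.substrate == current:
            genes.append(step.gene)
            current = step.product
    if current != end:
        raise ValueError(f"no route from {start!r} to {end!r}")
    return genes


if __name__ == "__main__":
    print(genes_needed("PmCoA"))
    print(genes_needed("DAPA"))
```

A strain fed an intermediate such as DAPA would, under this simplified linear model, need only the genes returned for that starting point; steps preceding PmCoA (bioC, bioH, bioX, bioW) are deliberately left out because the text says they differ between organisms.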

In E. coli, the biotin biosynthetic genes are located in three or more operons in the chromosome. The
bioA gene is located in one operon and the bioBFCD genes are located in a second closely linked operon.
The bioH gene is unlinked to the other bio genes (Fig. 2; Eisenberg, M.A. 1987 in Escherichia coli and
Salmonella typhimurium: Cellular and Molecular Biology, vol. 1, Amer. Soc. Micro., Wash. D.C.).

In B. sphaericus the organization of the bio genes is clearly different from that in E. coli. Gloeckler et
al. [(1990) supra] have isolated and characterized two unlinked DNA fragments from B. sphaericus that
encode bio genes. One fragment contains an operon encoding the bioD, bioA, bioY, and bioB genes, and
the other fragment contains an operon encoding the bioX, bioW and bioF genes (Fig. 2). The order and
clustering of bio genes is different in E. coli and B. sphaericus (Fig. 2).

Fisher (U.S. Patent 5,110,731) provides a system for producing biotin wherein the genes of the biotin
operon of E. coli are transformed into, and expressed in, a retention-deficient strain of E. coli.

Gloeckler et al. (U.S. Patent 5,096,823) describes genes involved in the biosynthesis of biotin in B.
sphaericus: bioA, bioD, bioF, bioC, and bioH. B. sphaericus genes for bioA and bioD were cloned into
both E. coli and B. subtilis. The bioA and bioD genes were stably integrated into B. subtilis Bio-
auxotrophs, and prototrophic strains were selected.

British Patent 2,216,530 provides plasmids containing gene(s) for E. coli bioA, bioB, bioC, bioD, and
bioF isolated from other E. coli genetic material, e.g., control sequences. The plasmids are capable of
replicating and being expressed in non-E. coli strains, preferably in yeast.

Three biotin synthesis deficient mutants of B. subtilis (bioA, bioB, and a gene termed bio112, which
may be analogous to E. coli bioF) have been reported [Pai, Jour. Bact. 121, 1-8 (1975); and Gloeckler et al.
(1990) supra].

Nippon Zeon Co. Ltd. (U.S. Patent 4,563,426) discloses biotin fermentation that includes adding pimelic
acid after culturing for about 24 hours. Transgene SA and Nippon Zeon Co. Ltd. (European Patent
Application Publication No. 379 428) disclose adding pimelic acid to a biotin fermentation medium.

The invention generally provides the genes of the biotin synthetic operon of B. subtilis and closely
related species to be used for high level production of biotin. Specific aspects of the invention are
described in greater detail below. We have specifically identified, cloned, and engineered a previously
unknown gene (bioI), which encodes a cytochrome P-450-like enzyme. We have also developed a strategy
to overexpress the entire B. subtilis bio operon (which, when engineered with a strong promoter, is
unexpectedly toxic to E. coli) by cloning two bio operon fragments separately, combining them in vitro, and
transforming the host organism with the resulting ligated construction. Cloning the two fragments was
further complicated by difficulty obtaining the 5' end of the operon, due to toxicity in E. coli. The invention
particularly features the full-length operon obtained by the above strategy. These and other features of the
invention are described in greater detail below.

In one aspect, therefore, the invention features a DNA comprising a DNA sequence selected from the
group consisting of: (a) a DNA sequence of a gene that encodes a biotin biosynthetic enzyme of Bacillus
subtilis, or of a species closely related to Bacillus subtilis; (b) a DNA sequence that encodes a biologically
active fragment of (a); or (c) a DNA sequence that is substantially homologous to (a) or (b). Also, as used
herein, a species which is "closely related" to B. subtilis includes a member of a cluster of Bacillus spp.
represented by B. subtilis. The cluster includes, e.g., B. subtilis, B. pumilus, B. licheniformis, B.
amyloliquefaciens, B. megaterium, B. cereus and B. thuringiensis. The members of the B. subtilis cluster
are genetically and metabolically divergent from the more distantly related Bacillus spp. of clusters
represented by B. sphaericus and B. stearothermophilus [Fig. 3; Priest, in Bacillus subtilis and other
Gram-Positive Bacteria, supra pp. 3-16; Stackebrandt et al., J. Gen. Micro. 133, 2523-2529 (1987)].

As noted above, we have discovered a novel gene (bioI) present in B. subtilis and closely related
species thereof, which is particularly important to deregulated production of biotin, and that gene is included
in the DNA of preferred embodiments of the first aspect of the invention. Also preferably, at least bioA and
bioB are included in the DNA of the first aspect. BioD, bioF, bioW, and ORF2 (encoding a β-keto
reductase-like enzyme) may also be advantageously included in the DNA of the invention. At least two of
the above-defined genes may be included in the DNA. The DNA sequences may be operably linked to a
transcriptional promoter, e.g., a constitutive promoter such as a promoter derived from the SP01
bacteriophage. The entire biotin operon of Bacillus subtilis, or a closely related species thereof, may be linked
to a single transcriptional promoter. Moreover, we have learned that it is particularly useful to include a
second promoter, i.e., one or more of the DNA sequences is/are operably linked to a first transcriptional
promoter, and at least a second one of the genes is operably linked to a second transcriptional promoter.
The first promoter may be operably linked to one or more of bioA, bioB, bioD, bioF, and bioW of B.
subtilis, or a closely related species thereof. The other promoter may be operably linked to one or more of
bioI, bioA, bioB, or a combination thereof. In a particularly preferred embodiment, the first promoter
controls transcription of the entire operon, and transcription of bioI, optionally with bioA and/or bioB, is also
controlled by the second promoter. The DNA may include a mutated regulatory site of a biotin operon of B.
subtilis or a closely related species, such as an operator, a promoter, a site of transcription termination, a
site of mRNA processing, a ribosome binding site, or a site of catabolite repression. By mutation, we mean
an insertion, a substitution, or a deletion with respect to the wild type regulatory site.

A second aspect of the invention relates to our discovery of Bacillus subtilis bioI. This aspect features
that gene or a gene specifically hybridizable to Bacillus subtilis bioI. It also features a biotin biosynthetic
enzyme encoded by such a gene.

The invention also features vectors comprising such DNAs as described above and cells comprising
such a vector or such DNAs. Preferably, the DNA is amplified to multiple copies in such cells. Also
preferably, the DNA is stably integrated into the chromosome of the cell. The DNA may be integrated at
multiple sites in the chromosome, at least one of which is the bio locus, and in multiple copies at each such
site. Also preferably, the cell is characterized by a mutation that deregulates production of biotin or a biotin
precursor, in addition to the presence of the DNA. Such mutated cells produce an increase in biotin in
comparison to wild-type cells lacking the DNA. Such a mutation may be one that confers resistance to
azelaic acid and/or it may be a mutation in birA.

The above described cells are used in methods of producing biotin or a precursor thereof in which the
cells are cultured for a time and under conditions which allow synthesis of biotin or the precursor, and biotin
or precursor is then isolated, preferably from the extracellular media of the cell.

Yet another aspect of the invention features a recombinant protein encoded by a DNA as described
above or a recombinant biotin biosynthetic enzyme comprising an amino acid sequence that is substantially
homologous to the amino acid sequence of a biotin biosynthetic enzyme of Bacillus subtilis, or a closely
related species thereof.

A final aspect of the invention features a method of selecting a mutant Bacillus subtilis cell
characterized in being deregulated for biotin production, by: (a) providing a population of Bacillus subtilis
cells; (b) allowing the population to reproduce in the presence of azelaic acid; (c) selecting a cell that is
resistant to azelaic acid; and (d) screening the cell, or a daughter cell further mutated to deregulate biotin
production, for the ability to overproduce biotin.

The above description of the invention may be further understood by reference to the following
definitions and explanations. The vector DNA may include a DNA sequence which is substantially
homologous to a gene (or to a DNA sequence that encodes a biologically active fragment of the gene) that
encodes a biotin biosynthetic enzyme of Bacillus subtilis, or a closely related species thereof. The DNA
sequence diverges from the wild type sequence by including a mutation, e.g., a deletion, an insertion, or a
point mutation, that enhances the synthesis of biotin when the DNA sequence is expressed in a cell. At
least two, three, four, five, or six, or preferably all, of the biotin operon genes may be operably linked to a
transcriptional promoter to yield a messenger RNA. As used herein, "operon" refers to one or more genes
co-transcribed from the same promoter. "Biotin operon" refers to a group of genes whose gene products
are involved in an aspect of biotin biosynthesis. By "promoter" is meant a nucleic acid sequence
recognised by an RNA polymerase enzyme that initiates transcription of a gene located in the 3' direction of
the promoter to yield a messenger RNA. By "operably linked to a transcriptional promoter" is meant that
the gene is sufficiently proximal to the promoter for an RNA transcript initiated at the promoter to include
messenger RNA that is complementary to that gene. The transcriptional promoter is either a constitutive
promoter, e.g., a promoter derived from the SP01 bacteriophage, or an inducible promoter.

Any of the genes of the operon can include a mutation that enhances the synthesis of biotin when the
DNA sequence is expressed in a cell. In a related embodiment, a regulatory site in the biotin operon, or in a
gene in the biotin operon, can be altered, e.g., by mutation, so as to increase the level of biotin produced in
a cell. Examples of regulatory sites that can be altered include a site of transcription termination, an
operator site, a site of mRNA processing, a ribosome binding site, or a site of catabolite repression.

Any of the vectors of the invention can be included in a host cell. The preferred host cell is a B. subtilis
cell for the reasons discussed below. However, the vectors of the invention can also be inserted into
another type of host cell, e.g., an E. coli cell, or any host cell containing the apparatus necessary to maintain
the vector and/or to express a gene of the biotin operon located on the vector. Where the host cell is used
for expression, it is also desirable for the host cell to have the ability to secrete biotin into the extracellular
medium, as does B. subtilis, simplifying collection of the biotin product. Some of the host cells that can be
used include, but are not limited to, Gram-positive bacteria, e.g., B. subtilis cells such as B. subtilis 168,
W23, or natto strains, other Bacillus strains such as B. licheniformis, B. amyloliquefaciens, B. pumilus, B.
megaterium, or B. cereus [for Bacillus strains see "Bacillus Genetic Stock Center Catalogue of Strains",
Ohio State Univ. Depart. Biochem., Columbus, OH, edited by D. H. Dean (1986) 3rd edition], or other Gram-
positive cells such as Lactococcus, Lactobacillus, Corynebacterium, Brevibacterium, Staphylococcus,
Streptomyces, or Clostridium; Gram-negative bacteria, e.g., strains of E. coli, Salmonella, Serratia, or
Klebsiella; or fungal cells, e.g., yeast. For strains and vectors useful with these Gram-positive cells see
"Bacillus subtilis and other Gram-Positive Bacteria" [edited by Sonenshein, Hoch, and Losick, Amer. Soc.
of Microbiology (1993)]. Biotin genes from Gram-positive organisms can be expressed in Gram-negative
bacteria (Gloeckler et al., supra) and even fungal genes from Saccharomyces cerevisiae have been
expressed in E. coli [Zhang et al., Archives of Biochemistry and Biophysics 309, 29-35 (1994)].

Where the vector is an extrachromosomal element it can be amplified, i.e., to multiple copies, in the
cell. Alternatively, if the vector is not an extrachromosomal element, the vector can be stably integrated into
the chromosome of the cell. Integrated vectors also can be amplified to multiple copies in the cell, i.e.,
integration can occur at multiple sites, or in multiple copies at each site. Integration may occur at a random
site on the chromosome, or preferably integration is directed to a preferred chromosomal locus, e.g., the
bio locus. The whole vector can integrate into the chromosome, or only the biotin biosynthetic sequences
themselves can be integrated into the chromosome absent at least a portion of the non-biotin biosynthetic
sequences, e.g., the replicon sequences. The cell containing the vector or biotin operon sequences can
further be deregulated for biotin production.

By "deregulated for biotin production" is meant that a negative limitation that controls the level of biotin
biosynthesis has been at least partially removed from the cell. A negative limitation includes, but is not
limited to, a regulatory protein (e.g., a repressor), a site of action of a regulatory protein (e.g., an operator),
an inhibitory factor, or a low level of a rate limiting enzyme. The cell can include a mutation in a genetic locus
that complements the birA locus of E. coli. Examples of B. subtilis strains that include a mutation that
causes an increase in biotin secretion include, but are not limited to, the strains HB3, HB9, HB15, HB43,
α-DB9, α-DB12, α-DB16, and α-DB17, or any of the mutant or engineered strains listed in Table 8, Table 9,
or Table 10. Biotin is preferably secreted into the extracellular media to a concentration of at least 0.1 mg/l,
1 mg/l, 10 mg/l, 100 mg/l, 300 mg/l, 500 mg/l, 750 mg/l, or 1.0 g/l. Preferably, the host cell is B. subtilis, but it
can also be any of the above-listed host strains.

By "vitamer" or "biotin vitamer" is meant any of the
compounds preceding biotin in the biosynthetic pathway that can be used to feed yeast, e.g., the following
compounds shown in Fig. 1: 7-KAP, DAPA, or desthiobiotin. The term "biotin precursor" includes each of
the biotin vitamers listed above, as well as PmA and PmCoA.

By "substantially homologous" we mean having
sufficient homology to yield specific hybridization under conditions that allow hybridization to DNA of
Bacillus species within the cluster that includes B. subtilis, but are too stringent to allow hybridization to
DNA of organisms outside the cluster of species closely related to B. subtilis. Suitable probes for this
purpose are provided below. Suitable hybridizations use relatively stringent conditions. For example,
nitrocellulose filters containing denatured DNA are incubated with a radioactively labeled DNA or RNA
probe in the presence of 5X SSC (0.75 M NaCl and 0.075 M Na citrate, pH 7.0), 10-50% formamide, 1X
Denhardt's solution (0.02% bovine serum albumin, 0.02% Ficoll, 0.02% polyvinylpyrrolidone), and 100 µg/ml
denatured salmon sperm DNA at 37-42°C. Those skilled in the art will understand that stringency can be
gradually increased (e.g., by increasing formamide concentration or temperature) until suitable specificity is
obtained (i.e., non-specific binding is reduced or eliminated).
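
The SSC figures quoted above follow from simple scaling, which can be checked with a short calculation. This is a minimal sketch under stated assumptions: the 1X SSC composition (0.15 M NaCl, 0.015 M sodium citrate) is the conventional laboratory definition rather than something stated in the text, the 20X stock is likewise only a common convention, and the function names are hypothetical.

```python
# 1X SSC composition in mol/L. Assumption: the conventional definition, which is
# consistent with the 5X values (0.75 M NaCl, 0.075 M Na citrate) given in the text.
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}


def ssc_molarity(fold: float) -> dict[str, float]:
    """Molar concentration of each SSC solute at the given fold concentration."""
    return {solute: round(fold * molar, 6) for solute, molar in SSC_1X.items()}


def stock_volume_ml(target_fold: float, final_ml: float, stock_fold: float = 20.0) -> float:
    """Volume of concentrated stock needed, from C1 * V1 = C2 * V2.

    The 20X default stock is an assumption, not taken from the text.
    """
    if target_fold > stock_fold:
        raise ValueError("target concentration exceeds the stock concentration")
    return final_ml * target_fold / stock_fold


if __name__ == "__main__":
    print(ssc_molarity(5))
    print(stock_volume_ml(5, 100.0))
```

Formamide percentage and temperature, the two stringency variables named in the text, are not modeled here; the sketch only confirms the salt arithmetic.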
By "biotin biosynthetic enzyme" is meant any one of the enzymes that form the biotin biosynthetic
pathway as shown in Fig. 1, or discussed herein, as well as enzymes encoded by genes newly disclosed
herein, e.g., bioI or ORF2. The term "biotin biosynthetic enzyme" also includes a portion or fragment of a
native biotin biosynthetic enzyme that performs the biochemical function of a biotin biosynthetic enzyme of
B. subtilis. The size of such a portion or fragment of a biotin biosynthetic enzyme is determined by the
functional requirement that it retain the biochemical activity of the native enzyme. There are many examples
in the literature of enzymes that retain one or more activities after shortening of the polypeptide chain by
proteolysis or by truncating the associated gene [for example, see Dautry-Varsat and Cohen, J. Biol. Chem.
252, 7685-7689 (1977)].

As used herein, the term "fragment" or "portion", as applied to a polypeptide, will ordinarily be at least
about 10 contiguous amino acids, typically at least about 20 contiguous amino acids, usually at least about
30 contiguous amino acids, preferably at least 50 contiguous amino acids, and most preferably at least
about 60 to 80 contiguous amino acids in length. Similarly, by the term "fragment" in the context of a
nucleic acid is meant a DNA sequence that encodes a polypeptide fragment as defined above. The ability
of a candidate fragment to perform the biological activity of the corresponding naturally-occurring enzyme
can be assessed by methods known to those skilled in the art, including, but not limited to, the following
protocols. Assays of pimelyl-CoA synthetase (bioW), 7-KAP synthetase (bioF), DAPA aminotransferase
(bioA), and DTB synthetase (bioD) are described by Izumi et al. [Methods in Enzy. 62, 326-338 (1979)]. A
cell-free assay of biotin synthetase (bioB) is described by Ifuku et al. [Biosci. Biotech. Biochem. 56, 1780-
1785 (1992)]. The product of bioI may be characterized as a cytochrome P-450 by the spectral
determinations described by Omura et al. [J. Biol. Chem. 239, 2370-2378 (1964)].

By "recombinant" is meant that the gene encoding the enzyme has been removed from its naturally
occurring site in the B. subtilis chromosome and inserted, either permanently or transiently, into a vector by
techniques of genetic engineering known to one skilled in the art. Preferably, the vector includes sequences
allowing for the expression of the inserted gene.

A "vector" as used herein refers to a nucleic acid molecule that can be introduced into a cell, e.g., by
transfection, by transformation, or by transduction. Vectors include, but are not limited to, plasmids,
bacteriophages, phagemids, cosmids, and transposons. Examples of vectors for use herein include, but are
not limited to, pBR322, pCL1920, pCL1921, pUC18, pUC19, or pSC101. These vectors replicate in E. coli
and other bacteria but do not replicate in B. subtilis, and are thus useful as integration vectors in B. subtilis
after addition of an appropriate selectable marker. Other vectors commonly used in B. subtilis include the
plasmids pUB110, pE194, pC194, and their derivatives, which replicate in B. subtilis, or any of the
integration vectors, plasmids, temperate phage vectors or transposons described in Chapters 40-44 of
Bacillus subtilis and Other Gram-Positive Bacteria (supra, pp. 585-650). Additional vectors used in
numerous microorganisms are described in "Cloning Vectors: A Laboratory Manual" [Pouwels et al., Elsevier
(1985) with supplementary updates in 1986 and 1988]. Recombinant, engineered B. subtilis DNA may also
be inserted (by homologous recombination) and amplified in the B. subtilis chromosome without using any
replicating plasmid vectors.

A "substantially pure nucleic acid," as used herein, refers to a nucleic acid sequence, segment, or
fragment that has been purified or separated from the sequences which flank it in its naturally occurring
state, e.g., a DNA fragment that has been removed from the sequences which are normally adjacent to the
fragment, e.g., the sequences adjacent to the fragment in a genome in which it naturally occurs. The term
also applies to nucleic acids which have been substantially purified from other components that naturally
accompany them in the cell. Preferably, the "sequence encoding a biotin biosynthetic enzyme of B. subtilis" is
a major component of the total purified nucleic acid sequence, e.g., at least 1% or 10% of the total nucleic
acid sequence.

"Homologous," as used herein, refers to the subunit sequence similarity between two polymeric
molecules, e.g., between two nucleic acid molecules, e.g., two DNA or RNA molecules, or two polypeptide
molecules. When a subunit position in both of the two molecules is occupied by the same monomeric
subunit, e.g., if a position in each of two DNA molecules is occupied by adenine, then they are homologous
at that position. A "best-fit" homology can be achieved by adjusting the alignment of the sequences. The
homology between two sequences is a function of the number of matching or homologous positions, e.g., if
half (e.g., 5 positions in a polymer 10 subunits in length) of the positions in two compound sequences are
homologous then the two sequences are 50% homologous; if 90% of the positions, e.g., 9 of 10, are
matched or homologous, the two sequences share 90% homology. There may be gaps of non-homologous
sequences among homologous sequences. "Substantially homologous" sequences are those that differ one
from the other only by conservative substitutions. For example, where the substitution is in a nucleic acid
sequence, the substitution either does not cause a change in amino acid at that position, or the substitution
results in a conservative amino acid substitution. A "conservative amino acid substitution" is, for example, a
substitution of one amino acid for another of the same class (e.g., amino acids that share characteristics of
hydrophobicity, charge, pKa, or other conformational or chemical properties, e.g., valine for leucine, arginine
for lysine). Sequences may also differ by one or more non-conservative amino acid substitutions, deletions,
or insertions, located at positions of the amino acid sequence that do not destroy the biological activity of the
polypeptide (as described above). An amino acid sequence is included within the scope of the invention if it
differs by a modification that reduces or alters the biological activity of one domain of a multiple-domain
enzyme, while preserving a second biological activity in a second domain of the enzyme. Generally, a
polypeptide is considered to be within the scope of this invention if it is at least 75%, preferably at least 80%,
or most preferably at least 90%, homologous to the naturally occurring amino acid sequence of a biotin
biosynthetic enzyme of B. subtilis. A nucleic acid sequence is considered to be within the scope of this
invention if it is at least 70%, preferably at least 80%, or most preferably at least 90%, homologous to a
naturally occurring nucleic acid sequence encoding a biotin biosynthetic enzyme of B. subtilis.
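
The position-by-position definition of homology above, together with the percentage thresholds, can be illustrated with a short calculation. This is a simplified sketch, not a method taken from the text: it assumes the two sequences are already aligned and of equal length, ignores gaps and "best-fit" realignment, and uses hypothetical names (`percent_homology`, `scope_tier`, `THRESHOLDS`) and toy sequences.

```python
# Thresholds from the text, in percent: (within scope, preferred, most preferred).
THRESHOLDS = {
    "polypeptide": (75.0, 80.0, 90.0),
    "nucleic acid": (70.0, 80.0, 90.0),
}


def percent_homology(a: str, b: str) -> float:
    """Percentage of positions occupied by the same subunit in two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def scope_tier(pct: float, kind: str) -> str:
    """Classify a homology percentage against the thresholds for the given molecule type."""
    minimum, preferred, most_preferred = THRESHOLDS[kind]
    if pct >= most_preferred:
        return "most preferred"
    if pct >= preferred:
        return "preferred"
    if pct >= minimum:
        return "within scope"
    return "outside scope"


if __name__ == "__main__":
    # Toy sequences only: 5 of 10 and 9 of 10 matching positions, as in the text's examples.
    print(percent_homology("AAAAAAAAAA", "AAAAACCCCC"))
    print(percent_homology("AAAAAAAAAA", "AAAAAAAAAC"))
    print(scope_tier(72.0, "nucleic acid"))
```

Note that the same percentage can fall in different tiers depending on the molecule type, since the minimum for nucleic acids (70%) is lower than that for polypeptides (75%).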

Bacterial strains containing the B. subtilis genes or DNA sequences provided herein are useful for
producing high levels of biotin or of a biotin precursor, these compounds being useful in turn as a dietary
additive for humans or animals. For instance, biotin can be supplied to a domesticated animal, e.g., a cow,
chicken, or pig, as an additive to a commercial preparation of animal feed. In addition, biotin can be
added to a vitamin dietary supplement for human use. Biotin is also useful as a reagent for research and
diagnostic procedures. For example, biotin is used as a non-radioactive label for proteins and nucleic acids.
Biotin-labelled proteins are detectable by virtue of biotin's naturally occurring ability to bind to avidin, a
protein found in egg-white, or to streptavidin, a biotin-binding protein produced by a streptomycete.

The figures are briefly described below.

Figure (Fig.) 1 is a schematic illustration of the biotin synthesis pathway of B. sphaericus.
Fig. 2 is a schematic illustration of the organization of the biotin genes in E. coli, B. sphaericus, and B.
subtilis.
Fig. 3 is a schematic illustration of phylogenetic incoherency including the genus Bacillus.
Fig. 4 is an illustration of the nucleotide sequence of the B. subtilis bio promoter region (SEQ ID NO:7),
including amino acid translations of the end of the ORF4-5 reading frame (SEQ ID NO:16) and the
beginning of the bioW reading frame (SEQ ID NO:17).
Fig. 5A is a physical map of the B. subtilis biotin operon.
Fig. 5B is a map showing transcription of the B. subtilis biotin operon.
Fig. 6 shows the restriction enzyme sites and complementation results for pBIO plasmids.
Fig. 7 shows the complementation results with deletions and subclones of pBIO201.
Fig. 8 is a physical map of the B. subtilis bio promoter region.
Fig. 9 shows the location and Bio phenotype of cat (chloramphenicol acetyltransferase) insertional
mutations within B. subtilis bioW, ORF2, and ORF3.
Fig. 10 shows the location and Bio phenotype of cat insertional mutations within B. subtilis bio
promoter region, ORF4-5, and ORF6.
Fig. 11 is a comparison of the nucleotide sequences of the B. sphaericus bioDAYB regulatory region
(SEQ ID NO:8) and the B. subtilis bio promoter region (SEQ ID NO:9).
Fig. 12 is an illustration of the restriction sites introduced for the 5' biotin operon cassette: 1. upstream
homology, terminator; 2. promoter, operator, leader; 3. ribosome-binding site, start codon, 5'-bioW.
Fig. 13A shows the orientation and sequence of the following PCR primers for the 5'-bio cassette:
QRF4 1 (SEQ ID NO:10), BIQL5' (SEQ ID NO:11), Leaden (SEQ ID NO:12), ANEB1224 (SEQ ID NO:13),
BIOLli (SEQ ID NO:14), and BIOL4 (SEQ ID NO:15).
Fig. 13B is a diagram of the construction of pBIO144.
Fig. 14 shows the DNA sequence of the B. subtilis biotin operon and its flanking sequences (SEQ ID
NO:1).
Fig. 15 shows the vitamer spectrum of various fermentation broths.
Fig. 16 is an illustration of an in-frame deletion in bioW.
Fig. 17A is an illustration of elements of the bioW catabolite repression sequence.
Fig. 17B is an illustration of the terminator region deleted between bioB and bioI.
Fig. 18 is a graph showing the azelaic acid resistance of strains PA3 and PA6.
Fig. 19 is a graph showing the azelaic acid resistance of strains BI535 and BI544.
It is highly desirable to develop an efficient system for producing high titres of biotin that can be
economically used in a commercial process. The present applicants have recognized that an improved
system for biotin production can be developed using B. subtilis.
B. subtilis has several advantages over the use of other species. Unlike B. sphaericus, B. subtilis is
highly characterized for the purposes of genetic engineering, increasing the ease by which one can a)
develop a mutant strain optimized for biotin production; and b) manipulate and construct genetic vectors
carrying genes encoding the biotin biosynthetic enzymes. Most importantly, applicants disclose herein that
most or all of the genes encoding the biotin biosynthetic enzymes are found on a single operon. This
makes it easier to generally regulate the expression of the operon, and to co-regulate the amount of each
enzyme expressed. Furthermore, B. subtilis contains a unique cytochrome P-450-like enzyme that is involved
in vitamer production and can be manipulated to significantly enhance vitamer production. Neither the gene
(bioI) encoding this enzyme nor any homologue of it from other organisms has been reported to play a role
in biotin or biotin vitamer synthesis.

Obtaining the genes for the B. subtilis biotin operon was not a straightforward task. Preliminary
attempts to identify the genes based on sequence similarity to B. sphaericus failed. The reason for the
failure is that, despite their common taxonomic grouping as Bacillus, B. subtilis and B. sphaericus are
quite divergent species (Stackebrandt et al., supra). Consequently the sequence homology between the
relevant B. sphaericus genes and corresponding B. subtilis genes was too low to permit cloning of the B.
subtilis genes using the B. sphaericus genes as probes.

Applicants have therefore used an alternative and more successful strategy to clone the genes required
for biotin biosynthesis (bio genes). This approach included complementation experiments with E. coli
mutants in bioA, bioB, bioC, bioD, bioF, and bioH, and further characterization by marker-rescue and
complementation experiments with known B. subtilis biotin mutants in bioA, bioB and bioF (Pai et al.,
supra). These experiments showed that in B. subtilis all six of these biotin biosynthetic genes are contained
on a single DNA fragment of approximately 8 kb. A detailed restriction map of this fragment has been
obtained, and an analysis of overlapping clones, deletion mutants, subclones, and their respective
nucleotide sequences allowed the genes to be located on the DNA fragment in the order, from right to left,
bioW, bioA, bioF, bioD, bioB, bioI, and ORF2. All seven genes are transcribed in the same direction,
compatible with their being part of a single operon.

The isolated biotin operon of B. subtilis was then inserted into a microbial host for the production of
biotin. The operon and the microbial host can each be, or can separately be, deregulated for biotin
production in order to provide a maximal level of biotin production.

EXAMPLE I. Cloning the B. subtilis genes for biotin biosynthesis.

Applicants have cloned and characterized B. subtilis genes required for biotin biosynthesis (bio genes).
Since prior to this work all that was known concerning B. subtilis bio genes was that mutations in bioA,
bioB and bioF existed and were closely linked on the chromosome (Pai et al., supra), two different
approaches were originally taken to clone these genes. The first approach involved testing of short (~45-60
bp) probes designed according to conserved sequences, and larger probes (~1 kb) generated by the
polymerase chain reaction (PCR) from B. sphaericus bio genes. However, these probes failed to hybridize
specifically to chromosomal digests of B. subtilis DNA. A second approach involved screening libraries of
B. subtilis DNA for recombinant clones that complement E. coli bio mutants.

IA: Attempts to clone the bio genes by DNA hybridization with B. sphaericus sequences.

To identify restriction fragments of B. subtilis that contained bio genes, short (~45-60 bp) probes to
internal regions of the bioA, bioB, and bioF genes of B. sphaericus were prepared. The sequences of the
probes were chosen based on conserved amino acids predicted from the bio DNA nucleotide sequences of
E. coli and B. sphaericus.

The characteristics of these probes were as follows: bioA, 60-mer and bioB, 48-mer (nucleotides
#1950-2009 and #3333-3380, respectively, from B. sphaericus sequence, GenBank™ accession #M29292);
bioF, 45-mer (nucleotides #2877-2921 from B. sphaericus sequence, GenBank™ accession #M29291).
Two of the probes did not hybridize (bioB probe) or hybridized poorly (bioF probe) to various
chromosomal digests of B. subtilis DNA even when the stringency of the hybridization conditions was low
(5x SSC, 10% formamide, 1x Denhardt's solution, 100 µg/ml single-stranded salmon sperm DNA, 37°C).
Only the bioA probe was able to hybridize. However, purified DNA fragments identified by bioA
hybridization failed to marker-rescue the B. subtilis bioA mutant, indicating that the fragments did not
contain the bioA gene. Furthermore, the DNA hybrids were unstable: the probe could be washed off the
filters under conditions of moderate stringency (0.25-0.1x SSC, 37°C). Similarly, larger DNA probes (~1 kb)
of the three bio genes, which were prepared by PCR amplification of B. sphaericus chromosomal DNA, also
failed to hybridize specifically with B. subtilis chromosomal DNA. Consequently these probes could not be
used to screen gene banks for recombinant clones containing B. subtilis bio genes.
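
The probe lengths quoted above can be cross-checked against the nucleotide coordinates given for each probe. The following is a minimal Python sketch of that check; the dictionary layout and function names are illustrative assumptions, and only the coordinates and stated lengths are taken from the text.

```python
# Probe coordinates as quoted in the text (1-based, inclusive ranges).
# The data layout and helper names are illustrative, not from the source.
PROBES = {
    "bioA": {"start": 1950, "end": 2009, "stated_length": 60},
    "bioB": {"start": 3333, "end": 3380, "stated_length": 48},
    "bioF": {"start": 2877, "end": 2921, "stated_length": 45},
}


def span_length(start, end):
    """Length of a 1-based inclusive coordinate range."""
    return end - start + 1


def check_probes(probes):
    """Map each gene to True if its coordinates match the stated probe length."""
    return {
        gene: span_length(p["start"], p["end"]) == p["stated_length"]
        for gene, p in probes.items()
    }


if __name__ == "__main__":
    for gene, ok in check_probes(PROBES).items():
        print(gene, "consistent" if ok else "inconsistent")
```

All three coordinate ranges agree with the stated 60-, 48- and 45-mer lengths, which supports reading the short-probe size range as roughly 45-60 bp.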

IB: Cloning the bio genes by complementation of E. coli mutants

A library of random B. subtilis fragments (~10 kb) was constructed in an E. coli vector using the positive
selection vector pTR264 [Lauer et al., J. Bact. 173, 5047-5053 (1991)]. pTR264 is a pBR322-derived vector
carrying an ampicillin resistance gene and the λ repressor gene as well as the gene for tetracycline
resistance under the control of the regulatory sequences subject to regulation by the λ repressor. pTR264
was constructed by reconstituting the ampicillin resistance gene of pTR262 by adding back the 5' end of
the gene from the PstI site. A unique BclI site is located within the λ repressor gene. Insertion of DNA
fragments into this site disrupts repressor function, thereby relieving repression of the tet gene. Clones with
inserts are selected by plating transformants on tetracycline plates.

pTR264, isolated from a dam- E. coli strain and digested with BclI, was ligated with B. subtilis
chromosomal DNA which had been partially digested with Sau3A and fractionated on a sucrose gradient
(8-12 kb fragments). The library (labelled BSB1) contained Tet^r plasmids that complemented all the known
E. coli bio point mutations. E. coli biotin mutants R879 (bioA24), R875 (bioB17), R878 (bioC23), R877
(bioD19), R872 (bioF3), and BM7086 (ΔmalA-bioH) [Cleary and Campbell, J. Bact. 112, 830-839 (1972);
Hatfield et al., J. Bact. 98, 559-567 (1969)] were transformed with the BSB1 library. Plasmids were isolated
from each Bio+ transformant. Plasmids pBIO100 and pBIO101 were isolated by complementation of R879
(bioA); plasmids pBIO102 and pBIO103 by complementation of R877 (bioD); plasmid pBIO104 by
complementation of R872 (bioF); plasmids pBIO109 and pBIO110 by complementation of BM7086 (bioH);
and plasmids pBIO111 and pBIO112 by complementation of R878 (bioC) (Fig. 6). Finally, DNA from the
BSB1 library was transformed into E. coli BM4062 (birA(ts)) [Barker and Campbell, J. Mol. Biol. 146,
469-492 (1981)]. Plasmids pBIO113 and pBIO114 were isolated from colonies that grew at 42°C.

Initial restriction analysis of the isolated plasmids indicated significant overlap of the cloned DNA
fragments, suggesting that the five genes bioA, bioC, bioD, bioF and bioH are clustered in B. subtilis in a
single operon. However, the birA-complementing plasmids pBIO113 and pBIO114 did not overlap with the
other five fragments.

IC: Restriction mapping of the bio plasmids 

A restriction map of the bio locus from unique EcoRI to BamHI sites is shown in Fig. 6. The EcoRI to
BamHI fragment cloned into a derivative of pBR322 was called pBIO201. The detailed restriction map of the
9.9 kb fragment in pBIO201 was obtained by standard single and double enzyme digestion analysis.

pBIO100, the first clone of bio genes isolated by complementation in E. coli, extended an additional
300 bp beyond the BamHI site at one end. pBIO110, isolated by complementation of bioH mutants of E.
coli, extended about 1100 base pairs beyond the EcoRI site at the other end. Southern hybridization
studies indicated that the insert DNA of pBIO100 was derived from a single continuous segment of the B.
subtilis chromosome.

ID: Complementation/marker rescue of B. subtilis and E. coli bio mutants with pBIO plasmids.

To confirm that the cloned DNA of pBIO100 contained B. subtilis bio genes, pBIO100 was tested for
the ability to marker-rescue B. subtilis bio mutants. The plasmid restored biotin prototrophy to bioA, bioB,
and bioF mutants at high frequency, indicating that the cloned DNA contained all or part of each of these B.
subtilis bio genes.

The pBIO plasmids were also examined for their ability to complement E. coli bio mutants bioA, bioD,
bioF, bioC and bioH. Most plasmids complemented more than one E. coli biotin mutation. The isolate
pBIO112 complemented E. coli mutations in bioA, bioB, bioC, bioD, bioF and bioH (Fig. 6). pBIO112 did
not complement the E. coli birA(ts) or the Δ(gal-uvrB) mutation. These data demonstrated that most of the
known biotin genes are in a single cluster in B. subtilis.

Several deletions in pBIO201 were constructed by cutting at two mapped restriction sites, filling in
overhangs if necessary, and ligating. After the structure of the deleted plasmids was confirmed, each was
transformed into various E. coli bio mutants and complementation was scored. The results are summarized
in Fig. 7. The deletion derivative pBIO203 was found to complement six of the known E. coli biotin genes
(bioA, bioB, bioC, bioD, bioF, bioH), establishing that all six of the genes were located in the 8 kb fragment
of DNA from BamHI to XhoI. The removal of 3.8 kb from the left of this fragment (pBIO204) eliminated the
ability to complement bioC and bioH mutants. pBIO206 contained only the rightmost 2.5 kb of the biotin
cluster and complemented only bioA and bioF mutants. The order based on these observations and
hybridization data was bioC, bioH, (bioB, bioD), bioF, bioA.
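
The deletion-mapping logic above can be sketched with simple set operations. This is a minimal Python illustration, not the applicants' procedure: the set representation and function name are assumptions, and the entry for pBIO204 assumes it still complemented the four mutants that the text does not list as lost.

```python
# Complementation results paraphrased from the text. The set-based layout and
# the helper name are illustrative assumptions.
ALL_GENES = {"bioA", "bioB", "bioC", "bioD", "bioF", "bioH"}

COMPLEMENTS = {
    "pBIO203": set(ALL_GENES),                # 8 kb BamHI-XhoI fragment
    "pBIO204": ALL_GENES - {"bioC", "bioH"},  # 3.8 kb removed from the left (assumed to keep the rest)
    "pBIO206": {"bioA", "bioF"},              # rightmost 2.5 kb only
}


def map_regions(results):
    """Assign each gene to the left, middle or right part of the cluster."""
    left = results["pBIO203"] - results["pBIO204"]  # lost when the left end is removed
    right = set(results["pBIO206"])                 # retained on the rightmost piece
    middle = results["pBIO203"] - left - right      # everything else
    return {"left": left, "middle": middle, "right": right}


if __name__ == "__main__":
    for region, genes in map_regions(COMPLEMENTS).items():
        print(region, sorted(genes))
```

The grouping reproduces the coarse order in the text (bioC and bioH on the left, bioB and bioD in the middle, bioF and bioA on the right); the order within each group needs the hybridization data, which this sketch does not model.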

IE: Cloning of a B. subtilis fragment containing a complete bioW.

DNA sequencing (see below) revealed that the promoter of the bio operon was not present on any of
the originally cloned DNA fragments. However, this promoter region was recovered by chromosomal
walking. None of the clones originally isolated by complementation of E. coli mutants had extended further
to the right than had pBIO100. This was surprising since bioA sat at the rightmost end of these clones and
fragments in the 8-10 kb range had been selected for cloning. However, DNA sequences of the rightmost
end of the cloned insert of pBIO100 revealed about 300 bp of an open reading frame that was somewhat
similar to B. sphaericus bioW. Fragments containing bioA and the adjacent upstream region were cloned in
E. coli bioA cells containing a pcnB mutation to reduce plasmid copy number [Lopilato et al., Mol. Gen.
Genet. 205, 285-290 (1986)]. Under such conditions a PstI fragment containing an additional 2.7 kb of DNA
upstream of bioA was cloned, and the location of the beginning of the bio operon was determined by DNA
sequencing.

The pcnB mutation in E. coli results in low copy number maintenance of ColE1-derived plasmids,
including pBR322 and pUC derivatives (Lopilato et al., 1986, supra). An E. coli strain, BI259, that contained
both the bioA and pcnB mutations, was constructed. Restriction enzyme, deletion, and Southern analyses
had shown that a 5.5 kb PstI fragment would contain a complete bioA gene. B. subtilis GP275
chromosomal DNA was digested with PstI and size fractionated by agarose gel electrophoresis. A pool of
4.4 to 6.6 kb fragments was ligated into a pBR322-derived plasmid and used to transform BI259. Selection
was for ampicillin resistance and biotin prototrophy. A plasmid, pBIO116, was recovered that contained a
5.5 kb insert. This plasmid could transform BI259 to biotin prototrophy at high frequency but could not
transform R879 (bioA, pcnB+) to either biotin prototrophy or ampicillin resistance. Southern hybridization
with a probe (a 600 bp PstI-BamHI fragment of pBIO100) containing the 300 bp that was somewhat similar
to B. sphaericus bioW was used to confirm that the cloned DNA contained the bioW homolog.

pBIO116 was available in very limited quantities from the pcnB background. The pcnB80 allele which
was used in this cloning experiment is reported to reduce the copy number of pBR322 replicons to about
6% of wild-type level (Lopilato et al., 1986, supra). To improve plasmid yields without impairing plasmid
stability, the DNA was cloned in a low copy-number plasmid. The unique BamHI site within the 3' end of
bioW was used to subclone a 3.0 kb BamHI-PstI fragment from pBIO116 into pCL1921. pCL1921 is a
derivative of the low-copy number plasmid pSC101 that contains the lacZ' polylinker cloning region of
pUC19 and a selectable spectinomycin/streptomycin resistance gene (Lerner and Inouye, Nuc. Acids Res.
18:4631, 1990). pCL1921 has a copy number of about 5-10 copies per cell. Purified 3.0 kb BamHI-PstI DNA
from pBIO116 was ligated to BamHI- and PstI-cut pCL1921 DNA and the ligated DNA was transformed into a
pcnB+ E. coli strain, MM294, selecting for spectinomycin resistance (100 µg/ml). A plasmid, pBIO350, was
recovered that contained the correct 3.0 kb BamHI-PstI fragment. The quantity of pBIO350 recovered from
this strain was significantly higher, without loss of plasmid stability, compared to pBIO116 isolated from the
pcnB80 strain.

EXAMPLE II. DNA sequencing of the B. subtilis bio gene cluster. 

To further identify the bio biosynthetic genes, to understand the regulatory apparatus controlling their
expression, and to locate sites appropriate for genetic engineering, the B. subtilis bio genes contained on
clones pBIO100 and pBIO350 were sequenced by the Sanger dideoxy sequencing method using
Sequenase™ kits, version 2.0 (United States Biochemicals, Cleveland, OH, USA) as instructed by the
manufacturer.

IIA: DNA sequencing strategy

The strategy used to obtain the DNA sequence of the 8-10 kb region that included the B. subtilis bio
genes was to divide the region into four plasmid subclones of approximately equal size, and then make
nested sets of deletions progressing through each subclone. To generate the nested deletions the
"exonuclease III - endonuclease S1" method was used; the reagents were purchased in a kit (Promega,
Madison, WI, USA). Nested deletions were made from both ends for three of the subclones and from one
end for the fourth. Sequencing both sets of nested deletions for three of the subclones gave the sequence
of both strands of each subclone, which is necessary to obtain a completely accurate sequence. For
pBIO350 one strand was determined similarly and the opposite strand was determined by synthesizing
sequencing primers at intervals of approximately 150 bp. The junctions between non-overlapping subclones
were confirmed by sequencing from synthetic primers using pBIO201 or pBIO100 (or subclones thereof) as
a template. The sequences were aligned and compared with the DNASTAR computer program (DNASTAR,
Inc., Madison, WI, USA).
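
To give a feel for the primer-walking effort on the second strand of pBIO350, the following Python sketch lays out primer start positions at a fixed interval. It is illustrative only: the 3.0 kb template length is taken from the description of the pBIO350 insert in Example IE, while the even spacing from offset 0 and the function name are assumptions.

```python
def primer_positions(template_bp, interval_bp):
    """Return 0-based start offsets of primers placed every interval_bp along a template."""
    if template_bp <= 0 or interval_bp <= 0:
        raise ValueError("template_bp and interval_bp must be positive")
    return list(range(0, template_bp, interval_bp))


if __name__ == "__main__":
    # Assumed inputs: a 3.0 kb insert and one primer roughly every 150 bp.
    positions = primer_positions(3000, 150)
    print(len(positions), "primers; first offsets:", positions[:4])
```

Under these assumptions about twenty primers would cover the insert; the actual number used is not stated in the text.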

IIB: Identification and organization of bio-specific coding regions and transcriptional regulatory
signals.

Analysis of ~8500 bp of the DNA sequence from pBIO100 and pBIO350 indicated a single bio operon
containing seven coding regions (Fig. 5A, Fig. 14, and SEQ ID NO:1). Starting at the 5' end of the bio
operon in pBIO350 and continuing through pBIO100, one finds first a ~100 bp region which contains a
putative promoter sequence recognized by the vegetative form (σA) of B. subtilis RNA polymerase (referred
to as Pbio) next to a transcription regulator site that is defined by a strong sequence homology to the
"operator" sites of the B. sphaericus bio operons.

The nucleotide sequence of this putative promoter region is shown in Fig. 4. [Fig. 4 symbols are as
follows. Dashed lines: regions of dyad symmetry; bold underline: similarity to the B. sphaericus bioDAYB
regulatory site; σA: promoter region recognized by the vegetative form of B. subtilis RNA polymerase; RBS:
ribosome binding site; *: restriction site blocked by dam methylation. Deduced amino acid sequences are
shown below the nucleotide sequence.] The sequence of Pbio is TTGACA -- 17 bp -- TATATT (SEQ ID
NO:2) and is in good agreement with the B. subtilis σA consensus sequence, TTGACA -- 17/18 bp --
TATAAT (SEQ ID NO:3). This region is immediately followed by an ORF (open reading frame) with
homology to bioW (259 amino acids), followed by ORFs with homology to bioA (448 amino acids), bioF
(389 amino acids), bioD (231 amino acids), and bioB (335 amino acids). The next two open reading frames,
ORF1 (bioI; 395 amino acids) and ORF2 (253 amino acids), showed no sequence similarity to any known
bio gene (Fig. 5A). The positions of the promoter, genes and putative transcription termination sites are
summarized in Table 1.
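
The "good agreement" between Pbio and the σA consensus can be made concrete by counting mismatches in the -35 and -10 hexamers. The Python sketch below is a simple illustration; the function names are assumptions, and a mismatch count is only a crude proxy for promoter strength.

```python
# Consensus hexamers for the B. subtilis sigma-A promoter, as quoted in the text.
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"


def mismatches(seq, ref):
    """Count positions at which two equal-length sequences differ."""
    if len(seq) != len(ref):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq, ref))


def score_promoter(minus35, minus10):
    """Return (mismatches in -35 hexamer, mismatches in -10 hexamer)."""
    return mismatches(minus35, CONSENSUS_35), mismatches(minus10, CONSENSUS_10)


if __name__ == "__main__":
    # Pbio hexamers from SEQ ID NO:2 as given in the text.
    print(score_promoter("TTGACA", "TATATT"))  # -> (0, 1)
```

Pbio matches the -35 hexamer exactly and differs from the -10 hexamer at a single position, with a 17 bp spacer that falls within the 17/18 bp consensus spacing.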

Table 1. Summary of Genes, Promoters, and Regulatory Elements in the B. subtilis biotin operon.

[Refer to SEQ ID NO:1 and Fig. 14 for numbering of bases]

The location and orientation of promoters, transcriptional termination sites, and ribosome binding sites
are indicative of a single operon containing the bio genes. Each gene is preceded by a strong Bacillus
ribosome binding site (RBS), with calculated ΔG's ranging from -11.6 to -20.4 kcal/mol. All genes are
oriented in the same transcriptional direction (right to left). In addition, the 5' ends of bioA, bioF, bioD, and
bioB overlapped the 3' ends of each preceding gene, suggesting that expression of these genes is
regulated, in part, by translational coupling. bioI and ORF2 are separated from the previous gene by 68 and
67 base pair intercistronic regions, respectively, indicating that they are not translationally coupled. The
bioW gene appears to be the first gene in the operon as it is preceded by the potential RNA polymerase
(σA) promoter site and putative operator site discussed above. This promoter region represents the
beginning of the bio operon, since approximately 50 bp upstream from it is a stem-loop structure that
resembles a rho-independent transcription termination site. This putative termination site represents the end
of a separate transcription unit since it is in turn preceded by a coding region with a strong Bacillus
ribosome binding site (RBS; ΔG = -14.8 kcal/mol) labeled ORF4-5 (299 amino acids), which is oriented in
the same direction as the bio operon (Fig. 8). Finally, further upstream from ORF4-5, oriented in the
opposite direction, there is a strong Bacillus RBS (ΔG = -17.4 kcal/mol) followed by the first 266 amino
acids of another open reading frame, ORF6; the remainder of ORF6 continues beyond the PstI site. The
deduced amino acid sequence of ORF6 showed significant similarity to a number of regulatory proteins
related to the lacI repressor: E. coli ebgR (a repressor of a cryptic operon of unknown function), E. coli
purR (a repressor of the purine nucleotide biosynthetic operon), and E. coli cytR (a pleiotropic
transcriptional repressor of deoCABD, udp, and cdd encoding catabolizing enzymes and nupC, nupG, and tsx
encoding transport and pore-forming proteins). Transcription of ORF4-5 and ORF6 may be coordinated
since we detect two overlapping σA promoter sequences within the 175 bp gap between the 5' ends of
ORF4-5 and ORF6 (TTGTAA -- 18 bp -- TAATAT (SEQ ID NO:4) -- ORF6; TTGATA -- 17 bp -- AAAAGT
(SEQ ID NO:5) -- ORF4-5) and a series of inverted repeats.

ORF2 is the last gene in the bio operon based on the presence of a stable stem-loop structure
resembling a rho-independent transcription termination site immediately at the end of this coding region. A
second stem-loop structure with terminator-like features was identified in the intercistronic region between
bioB and bioI. Several secondary structures of the mRNA are possible, with the most favored structure
having a ΔG of formation of -11 kcal/mol and the least favored structure a ΔG of -5.6 kcal/mol.

Downstream from the end of the biotin operon is a strong RBS (ΔG = -20.0 kcal/mol) and 260 amino
acids of another coding region, ORF3. The remainder of ORF3 continues beyond the BstXI site which
marks the end of the sequenced region. The deduced amino acid sequence of ORF3 showed significant
similarity to a number of E. coli membrane-associated transport proteins: glycerol-3-phosphate permease
(ugpE and ugpA); maltose permease (malG and malF); and molybdenum permease (chl). In particular, the
partial ORF3 peptide contains a 20 amino acid sequence at the COOH-terminal region found common to all
membrane-associated transport proteins. ORF3 is transcribed separately from the bio operon using a
putative σA promoter sequence TAGACA-N18-TACATT (SEQ ID NO:1; Fig. 14, #7600-7629) 95 bp
upstream of ORF3.
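
The spacer length of the putative ORF3 promoter can be derived from the quoted coordinates, which is useful because the spacer subscript is hard to read in the printed sequence. The Python sketch below assumes the coordinate range #7600-7629 is 1-based and inclusive and spans exactly the two hexamers plus the spacer; the function name is illustrative.

```python
HEXAMER_BP = 6  # length of each of the -35 and -10 promoter elements


def spacer_from_coordinates(start, end, hexamer_bp=HEXAMER_BP):
    """Spacer length implied by a 1-based inclusive range covering -35, spacer and -10."""
    total = end - start + 1
    spacer = total - 2 * hexamer_bp
    if spacer < 0:
        raise ValueError("range is shorter than the two hexamers")
    return spacer


if __name__ == "__main__":
    print(spacer_from_coordinates(7600, 7629))  # -> 18
```

Under that assumption the range covers 30 bp, which leaves an 18 bp spacer between TAGACA and TACATT, within the 17/18 bp spacing of the σA consensus.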

The gene-enzyme relationships, the enzyme sizes, and percent homology to the same enzyme from
other organisms are summarized in Table 2.

Complementation studies using plasmid subclones that contained either bioI or ORF2 alone under the
transcriptional control of the lac promoter indicate that bioI alone is sufficient to complement either a bioC
or bioH mutation of E. coli. Copies of bioI and ORF2 were generated by PCR. A HindIII site was introduced
at the 5' end of each gene, a BamHI site was introduced at the 3' end of bioI and an Asp718I site was
introduced at the 3' end of ORF2. The PCR-generated fragments were each cloned into three plasmids with
different copy number: the low copy number plasmid pCL1921, a medium copy number plasmid, pJGP44
(which is derived from pBR322), and the high copy number plasmid, pUC19. In two of these recombinant
plasmids expression of bioI and ORF2 is under the control of the lac promoter (pCL1921 and pUC19).

Plasmids containing bioI complemented both E. coli BM7086 (ΔbioH) and E. coli R878 (bioC).
Plasmids containing ORF2 did not give normal complementation of either E. coli BM7086 or R878. It was
clear from these experiments that the product of the bioI gene of B. subtilis is able to supply an activity
needed for biotin synthesis that can substitute for, or overcome, the activity missing in either bioC or bioH
mutants of E. coli.

A plasmid (pBIO403) containing only the B. subtilis bioW gene and its promoter cloned into pCL1921
(Lerner and Inouye, 1990, supra) complemented both E. coli ΔbioH and bioC mutants, if and only if pimelic
acid was added to the medium at about 30 mg/l. This experiment confirmed that bioW encodes a pimelyl-
CoA synthetase that can bypass bioH and bioC in E. coli.

No significant similarity was detected between the deduced amino acid sequence of either B. subtilis
bioI or ORF2 and the protein sequences of E. coli bioC or bioH genes or other bio genes. Subsequent
comparison to the protein database of GenBank™, however, indicated significant similarity of bioI to a
number of cytochrome P-450 enzymes from B. megaterium (BM-1), S. erythraea (eryF and ery/0), S.
griseolus (suaC and subC), S. species strain SA-COO (choP), and other organisms. Cytochrome P-450s
are a class of enzymes that include monooxygenases that are known to catalyze hydroxylation of many
different kinds of substrates, including fatty acids. Since synthesis of pimelic acid, a precursor to biotin,
might involve hydroxylation and/or further oxidation of an unidentified fatty acid, bioI may be involved in an
early step in biotin synthesis. The bioC and/or bioH genes are functionally equivalent to bioI, based on the
ability of bioI to complement bioC and/or bioH mutations. Similar comparative studies revealed weak
similarity of ORF2 to the β-ketoreductase domain of polyketide synthase II (eryAII), which is involved in an
early enzymatic step in erythromycin formation.

EXAMPLE III: cat insertional mutagenesis of the bio operon and flanking coding regions. 

To verify the boundaries of the bio operon predicted from the nucleotide sequence and to confirm the
role of previously unidentified bio genes, a cat cassette was used to construct insertions or deletions in
bioW, bioI, ORF2, the bio promoter region, and in ORF3, ORF4-5, and ORF6 (located outside the predicted
boundaries of the bio operon). The cat cassette includes a chloramphenicol resistance gene. To make the
above-mentioned constructions, plasmid derivatives containing these mutations were first constructed in E.
coli. The cat insertions were then transferred to the bio chromosomal locus of B. subtilis by DNA
transformation using standard procedures. To determine whether the insertions or deletions inactivated
biotin synthesis, colonies containing these mutations were assessed for growth on biotin-free medium agar
plates with or without the presence of biotin (Bio phenotype).

As diagrammed in the top of Fig. 9 and Fig. 10, the cat cassette was inserted by ligation into the
coding regions of bioW using a BamHI site; into bioI using a SmaI site; into ORF3 using an XmnI site; into
ORF6 using an EcoRV site; between the pair of SstI sites, deleting the 3' end of ORF2; or between the pair
of BstBI sites, deleting ORF4-5. The cat cassette was also used either to disrupt the bio promoter region by
ligating it into the Eco47III site, or to entirely replace this promoter region by ligating it between the
HpaI sites. In each of the ORF2/SstI, ORF4-5/BstBI, and Eco47III constructions, the cat gene was inserted
only in the same direction as the disrupted coding region or promoter region. In all other constructions, two
different plasmid derivatives were generated where the cat cassette was inserted in either possible
orientation. Each of these mutations was then integrated into the bio locus by first linearizing the cat-
containing plasmid by a restriction enzyme cut outside of the bio DNA and transforming this cut DNA into a
competent prototrophic B. subtilis strain, PY79, and selecting for chloramphenicol resistance (Cm^r). The Bio
phenotype of each mutant is summarized at the bottom of Fig. 9 and Fig. 10. Insertions within the coding
regions located outside of the predicted bio operon, ORF3, ORF4-5 and ORF6, generated Cm^r prototrophic
colonies, indicating that these mutations had no phenotype with respect to biotin production and with
respect to auxotrophy. Insertions within the bio operon gave complex results that generally supported the
nucleotide sequence data. Interruption of the bio promoter region with the cat gene oriented in the opposite
direction relative to the biotin operon, and interruption of bioW with the cat gene oriented in either direction
relative to the bio operon, generated an unambiguous Bio- phenotype, confirming the location of these
sequences at the 5' end of the bio operon/promoter region. However, replacement of the Pbio promoter
region with the cat gene inserted in the same transcriptional direction as the biotin operon generated
Bio+ bacteria at a low frequency (0.1%). Bioassay experiments indicated that bacterial biotin vitamer
production was increased in the presence of low concentrations of chloramphenicol, suggesting that
transcription of the biotin operon was under the control of the cat promoter. The 3' end of the operon could
not be definitively identified by this genetic method. Insertions within bioI resulted in Cm^r colonies that were
partially deficient in biotin production, i.e., grew poorly on biotin-free medium but grew to wild-type levels in
the presence of biotin (33 µg/ml), whereas the ORF2::cat mutation generated Bio+ colonies. These results
suggested that bioI is not absolutely required for biotin production, and the ORF2 gene product appeared to
be dispensable for wild-type growth in the absence of exogenous biotin. No significant effect on biotin
production was detected in a birA strain (e.g., BI421; see Example XB) containing the ORF2::cat mutation.
Nevertheless, it is still possible that ORF2 may be required for overproduction of biotin.

The partial biotin-deficient phenotype generated by the bioI::cat mutation, designated as Bio+/-,
appeared to be caused by inactivation of bioI rather than by a polar effect because mutations within the
downstream genes ORF2 or ORF3 were Bio+. To determine whether the Bio+/- phenotype was genuine and
to verify that the bioI gene product was involved in formation of pimelic acid, the bioI::cat mutation was
bypassed by feeding pimelic acid. As summarized in Fig. 9, strains of PY79 containing this mutation in
either orientation of the cat gene grew to wild-type levels on biotin-free medium containing pimelic acid (33
µg/ml). These results confirm that the bioI gene product is involved in early biotin formation and that
inactivation of this product only partially disrupts biotin production.

EXAMPLE IV: Analysis of the regulatory mechanism of the biotin operon.

Transcription of the divergent E. coli bio operon bioABFCD is regulated by a classical
repressor/operator mechanism involving a repressor encoded by the birA locus [Cronan, Cell 58, 427-429
(1989)]. This repressor is a bifunctional molecule carrying the holoenzyme synthetase activity at its COOH-
terminal end, an activity which converts biotin into biotinoyl-AMP, an adenylated form of biotin, before
transferring it to the apocarboxylase enzyme. Biotinoyl-AMP also functions as the co-repressor, the
repressor/biotinoyl-AMP complex blocking transcription by binding to an operator site that overlaps the -10
regions of two divergent promoters.

The 5' promoter and regulatory region of the wild-type bio operon was characterized in order to replace
it with one of several strong and constitutive B. subtilis promoters (see Example VII). The most likely site
for initiation of transcription of the B. subtilis bio operon is a σA promoter, Pbio, approximately 84 bp
upstream from bioW, the first gene in the operon (Fig. 4). The actual mRNA start site is either one of the
two adenosine nucleotides, 3-4 bp downstream from the end of the "TATATT" box, or the guanosine 7 bp
from the "TATATT" box. The RNA leader sequence contains a 33 bp segment with strong sequence
homology to the "operator" sites of the B. sphaericus bio operons. Comparison of the nucleotide
sequences of this region to the 5' non-coding region of the B. sphaericus bioDAYB operon is shown in Fig.
11. [Fig. 11 symbols are as follows. Upper sequence (B. sphaericus bioDAYB regulatory region): 15 bp
regulatory region: double bold underline; region of dyad symmetry: dashed line; start site of transcription
determined by primer extension mapping: arrows; ribosome binding site: RBS. The "G" and "T"
nucleotides were displaced to facilitate sequence alignment. Lower sequence (B. subtilis bio promoter
region): 13 bp putative regulatory region based on similarity to the B. sphaericus regulatory region in the
upper sequence: single bold underline; putative start sites of transcription: arrows; ribosome binding site:
RBS. The "C" nucleotide was displaced to facilitate sequence alignment.]

The majority of conserved nucleotides are clustered at two sites (13 and 11 bp) separated by a 9 bp
segment. This finding suggests that transcription of the B. subtilis bio genes is regulated by a
repressor/operator mechanism, possibly involving a birA-like gene located near the trp operon (see Example
IX). The activity and regulation of a promoter upstream of bioW has been verified by showing that a
translational lacZ fusion to bioW, which included Pbio and the putative regulatory region, displays biotin-
regulated expression of β-galactosidase (see Example V).

The 5' RNA leader contains a potential large stable stem-loop structure (ΔG = -14.0 kcal/mol) that
overlaps the operator region.

Based on the identification of Pbio and a regulatory site upstream from bioW, the B. subtilis bio genes
are transcribed as a single polycistronic message of approximately 7200 bp. In addition, there also exist
possible secondary promoter sites located within the internal regions of the bio operon. For example, a
sequence, TTGAAA -- 17 bp -- TCTTAT (SEQ ID NO:6 and Fig. 14, #6258 to 6286), with some similarity to
a consensus σA promoter sequence, is located within bioI (approximately 775 bp downstream from the start
codon of bioI). Determination of whether these sequences function as internal promoters can be achieved
by using restriction fragments from internal portions of the bio operon to construct translational lacZ fusions
(see Example V). Optimization of biotin production can be achieved by modifying one or more of the
primary or secondary promoters.
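
The figure of approximately 7200 bp for the polycistronic message can be checked roughly against the ORF sizes given in Example IIB. The Python sketch below is a back-of-the-envelope estimate under stated assumptions: each ORF contributes three bases per amino acid plus one stop codon, the 84 bp promoter-to-bioW distance stands in for the leader, and the two quoted intercistronic gaps (68 and 67 bp) are added. The short overlaps between bioW, bioA, bioF, bioD and bioB and the terminator region are not modelled because their sizes are not given here.

```python
# ORF sizes (amino acids) as quoted in the text.
ORF_AA = {
    "bioW": 259, "bioA": 448, "bioF": 389, "bioD": 231,
    "bioB": 335, "bioI": 395, "ORF2": 253,
}
LEADER_BP = 84               # assumed stand-in: promoter-to-bioW distance from the text
INTERCISTRONIC_BP = (68, 67)  # gaps preceding bioI and ORF2


def coding_bp(orf_aa):
    """Total coding length in bp, counting one stop codon per ORF."""
    return sum((aa + 1) * 3 for aa in orf_aa.values())


def rough_transcript_bp(orf_aa, leader_bp, gaps_bp):
    """Rough message length: leader + coding regions + known intercistronic gaps."""
    return leader_bp + coding_bp(orf_aa) + sum(gaps_bp)


if __name__ == "__main__":
    print("coding bp:", coding_bp(ORF_AA))
    print("rough transcript bp:", rough_transcript_bp(ORF_AA, LEADER_BP, INTERCISTRONIC_BP))
```

Under these assumptions the estimate comes to 7170 bp, which is in line with the approximately 7200 bp stated for the message.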

IVA: Construction and analysis of bio-lacZ translational fusions.

A translational lacZ fusion was constructed to confirm the activity and regulation of the putative
promoter and regulatory region, and to assess the relative level of expression of the B. subtilis biotin
operon in a variety of contexts. This was accomplished by inserting a 3.1 kb BamHI-BglII fragment
containing a promoter-less lacZ coding region into the BamHI site of pBIO350, to give pBIO397 and
pBIO398. These two presumably identical plasmids contain an in-frame "translational" fusion between bioW
and lacZ on a low copy number plasmid. pBIO397 and pBIO398 turn a lacZ- E. coli pale blue on X-gal
indicator plates, suggesting that the fusion is expressed at a relatively low level in E. coli.

To test the bioW-lacZ fusion for biotin-regulated expression in B. subtilis, Bio+ partial diploids were
constructed. The cat cassette was cloned into pBIO397 using the single SmaI site located within the
polylinker region downstream from the lacZ gene. One recombinant plasmid, pBIO397cat, with the cat
gene oriented in the same direction as the translational fusion, was used to generate Bio+ partial diploids.
Competent cells of the prototrophic B. subtilis strain PY79 were transformed with pBIO397cat plasmid DNA,
selecting for Cm^r. Transformants should only arise by recombination into the chromosome. This leaves an
intact copy of the bio operon, allowing activity of the lac fusion to be assessed in the absence of added
biotin.

The promoter of the BamHI to PstI fragment cloned in pBIO350 is regulated by biotin. Two Cm^r Bio+ partial diploids containing the bioW-lacZ fusion were tested for β-galactosidase activity in biotin-free medium in the presence or absence of biotin (100 ug/liter). In liquid ONPG assays, both strains were regulated specifically by biotin. The level of β-galactosidase was repressed in the presence of biotin and induced in its absence to 24 to 85-fold higher specific activity (Table 3). This strain with the bioW::lacZ fusion integrated at the bio operon has also been used to isolate new biotin analog-resistant mutants (see below) on X-gal indicator plates.
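The fold increases quoted for the two plasmid-integrant strains can be recomputed from the Miller-unit values in Table 3. The sketch below assumes the fold increase is simply the ratio of activity without biotin to activity with biotin; the function name is illustrative, and small differences from the printed values are expected because the table entries are themselves rounded averages.

```python
# Miller-unit values for the two plasmid-integrant strains, copied from Table 3.
# Each row: (strain, growth phase, +biotin, -biotin, fold increase as printed).
TABLE_3_ROWS = [
    ("PY79(bioW-lacZ)12A", "late log", 0.47, 11.30, 24),
    ("PY79(bioW-lacZ)12A", "stationary", 0.17, 11.00, 65),
    ("PY79(bioW-lacZ)12A", "late stationary", 0.13, 6.85, 53),
    ("PY79(bioW-lacZ)14A", "late log", 0.64, 21.10, 33),
    ("PY79(bioW-lacZ)14A", "stationary", 0.15, 12.30, 82),
    ("PY79(bioW-lacZ)14A", "late stationary", 0.10, 8.45, 85),
]


def fold_increase(repressed, derepressed):
    """Ratio of activity without biotin to activity with biotin."""
    if repressed <= 0:
        raise ValueError("repressed activity must be positive")
    return derepressed / repressed


computed = [
    (strain, phase, fold_increase(plus, minus), printed)
    for strain, phase, plus, minus, printed in TABLE_3_ROWS
]

for strain, phase, fold, printed in computed:
    print(f"{strain:20s} {phase:16s} {fold:6.1f}  (table: {printed})")
```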

Table 3

Biotin-regulated expression of bioW-lacZ translational fusion.

                                                      β-galactosidase activity (Miller Units) a
Strain                     OD600                      + Biotin (100 ug/l)   - Biotin   Fold Increase
PY79(bioW-lacZ)12A         Late Log (0.7-0.8)         0.47                  11.30      24
                           Stationary (1.1-1.4)       0.17                  11.00      65
                           Late Stationary (1.4-1.6)  0.13                  6.85       53
PY79(bioW-lacZ)14A         Late Log (0.7-0.8)         0.64                  21.10      33
                           Stationary (1.1-1.4)       0.15                  12.30      82
                           Late Stationary (1.4-1.6)  0.10                  8.45       85
PY79(SPβ::bioW-lacZ)/#1    (1.04-1.08)                0.04                  0.60       15
PY79(SPβ::bioW-lacZ)/#3    (0.988-0.960)              0.06                  0.49       9
BI421(SPβ::bioW-lacZ)/#1   (1.00-0.936)               0.67                  1.00       1.5
BI421(SPβ::bioW-lacZ)/#3   (1.07-0.92)                0.96                  0.86       0.9
α-DB9(SPβ::bioW-lacZ)/#1   (0.892-0.528)              0.18                  1.65       9.2
α-DB9(SPβ::bioW-lacZ)/#3   (1.344-1.200)              0.11                  1.35       12.3
HB3(SPβ::bioW-lacZ)/#1     (1.068-1.076)              0.24                  0.32       1.3
HB3(SPβ::bioW-lacZ)/#3     (1.213-1.320)              0.18                  0.15       0.8

a Average of two measurements, except for one α-DB9(SPβ::bioW-lacZ) isolate (one measurement).



IVB. Construction and analysis of an SPβ-borne bioW-lacZ translational fusion.

Other strains, in addition to the lacZ fusion strain of Example IVA, can be constructed for deciphering biotin regulatory mechanisms. For example, Applicants recognized that insertion of the plasmid-borne bioW-lacZ fusion into the chromosome causes technical problems because the integrated plasmid amplifies in copy number. Thus the expression of the bioW-lacZ fusion at single copy number and at a site distinct from the bio operon was tested by introducing the fusion into a modified SPβ specialized transducing phage [see, e.g., Zuber et al., J. Bacteriol. 169, 2223-2230 (1987)].

Two isolates of PY79 (SPβ::bioW-lacZ) and BI421 (SPβ::bioW-lacZ) were assayed for β-galactosidase activity as described above. The results are summarized in Table 3.

In the Bio+ strain PY79, expression of SPβ::bioW-lacZ was very low, but showed biotin-specific regulation. β-galactosidase was repressed in the presence of biotin and induced approximately 9-15 fold in the absence of biotin. Comparison to earlier assays of PY79 containing two or more copies of the plasmid-borne bioW-lacZ fusion indicated that the single-copy fusion produced three-fold less β-galactosidase under repressed growth conditions and about 20-fold less under derepressed growth conditions. In a B. subtilis birA strain (BI421; see Example XB) low-level constitutive expression of the fusion was observed. The levels of β-galactosidase in the birA strain were only slightly higher than the level observed in PY79 (SPβ::bioW-lacZ) grown under derepressed growth conditions. In one of the BI421 (SPβ::bioW-lacZ) isolates biotin slightly repressed β-galactosidase expression, suggesting that this birA mutation may not completely relieve biotin regulation of the fusion. These results confirmed that expression of bioW is regulated by biotin.

Biotin regulation was also examined in two independently isolated biotin analog-resistant mutants. HB3 contains a spontaneous birA mutation and α-DB9 contains an α-dehydrobiotin-resistant mutation unlinked to either bio or birA. These strains were transduced, grown, and assayed as described above, except that α-DB9 (SPβ::bioW-lacZ) was assayed during mid-exponential growth. As summarized in Table 3, the fusion-bearing HB3 strains displayed low-level constitutive lacZ expression. Unlike the birA mutants, however, α-DB9 (SPβ::bioW-lacZ) displayed biotin-regulated expression of lacZ.

EXAMPLE V: Secretion of biotin and biotin vitamers from E. coli strains containing B. subtilis bio genes.

During complementation experiments, Applicants observed that an E. coli bioA mutant containing pBIO201 could cross-feed the same strain containing the control plasmid pBR322, demonstrating that biotin synthesized by the pBIO201-containing strain was secreted into the media and metabolized by the Bio- strain. Applicants therefore tested various newly constructed plasmids for the ability to secrete biotin and biotin vitamers into the media.

E. coli strains MM294 and JM109 lacIq (both strains are wild-type for bio genes) were transformed with pBR322, pBIO201, pUC19, and pBIO289 (described in Example VI, below). The pBR322 and pBIO201 transformants were grown in minimal medium containing 2% glucose. The pUC19 and pBIO289 transformants were grown in a rich medium containing 2% glycerol since they did not grow well in liquid minimal medium. After 48 hours, cells were removed by centrifugation and any residual live cells were killed with chloroform. Supernatants were diluted serially in ten-fold diluted Difco Biotin Assay Medium supplemented with 0.5% glucose and 5 mg/l thiamine, and tested for support of growth of E. coli Δ(mal-bioH) and E. coli Δ(bioA-D), described below. Standard curves were prepared from serial dilutions of biotin and desthiobiotin. The assay was sensitive to 1 ug/liter.
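Reading a concentration off a standard curve and correcting for the dilution of the supernatant can be sketched as follows. The text does not state how the curves were evaluated, so the log-linear interpolation used here is an assumption, and the standard-curve readings are invented purely for illustration.

```python
import math


def estimate_concentration(od, standards, dilution_factor=1.0):
    """Estimate concentration (ug/l) in the undiluted sample.

    standards: list of (concentration_ug_per_l, od) pairs, concentrations > 0.
    log10(concentration) is interpolated linearly in OD between the two
    bracketing standards (an assumed curve shape, not taken from the text).
    Returns None when the reading falls outside the standard curve.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (c_lo, od_lo), (c_hi, od_hi) in zip(points, points[1:]):
        if od_hi > od_lo and od_lo <= od <= od_hi:
            frac = (od - od_lo) / (od_hi - od_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return (10 ** log_c) * dilution_factor
    return None


# Hypothetical standard curve: (biotin in ug/l, OD of the indicator culture).
standards = [(1.0, 0.10), (10.0, 0.50), (100.0, 0.90)]

# A ten-fold diluted supernatant that reads like the 10 ug/l standard.
estimate = estimate_concentration(0.50, standards, dilution_factor=10)
print(estimate)
```

Readings outside the curve return None rather than an extrapolated value, which mirrors the usual practice of re-assaying at a different dilution.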

The results from these assays are shown in Table 4. Strains containing plasmids encoding B. subtilis bio genes secreted biotin, while strains containing control plasmids did not secrete biotin. This demonstrated that the E. coli strains containing the B. subtilis biotin operon (pBIO289) are capable of secreting enhanced levels of biotin and biotin precursors.

Table 4

Production of biotin and biotin vitamers* from E. coli strains containing B. subtilis bio genes.

Strain    Plasmid    Biotin (ug/l)    Total Biotin and Biotin Vitamers (ug/l)*
MM294     pBR322     0                3
          pBIO201    10               10
JM109     pUC19      0                0
          pBIO289    10               10
MM294     pUC19      0                1
          pBIO289    10               100

* Biotin vitamers are given as desthiobiotin equivalents.



This assay can also be used to test the level of biotin, or biotin precursor, produced by other strains, e.g., B. subtilis strains, or other plasmids. Candidate strains are tested after being transformed with a plasmid bearing a functional biotin operon. Alternatively, candidate plasmids are tested in a strain known to secrete biotin.

EXAMPLE VI: Construction of a "minimal" bio subclone. 

The minimal subclone encodes all of the relevant functions of the original primary clones (e.g., pBIO100 and pBIO201). A "minimal" subclone was constructed to confirm the location of the bio genes derived from deletion mapping and DNA sequence information, and to confirm that the open reading frame downstream from a transcription terminator to the right of an EcoRV site (see Fig. 5A) is not required for biotin biosynthesis.

An EcoRV (partial) to BamHI fragment from pBIO201 (containing all of the open reading frames thought to be bio genes except the bioW gene) was inserted into the SmaI to BamHI backbone of pUC9 [Vieira and Messing, Gene 19, 259-268 (1982)] to construct pBIO289. Since the bio genes of pBIO289 are all downstream from the lac promoter of pUC9, and since pUC9 is maintained at a higher copy number than pBR322 (the parent of pBIO100 and pBIO201), pBIO289 is expected to express the bio genes at a higher level than pBIO201 and pBIO100 in an E. coli host. pBIO100, 201, and 289, as well as pBR322 (as a control), were transformed into a series of isogenic E. coli bio mutants. Each transformant was tested for complementation on medium lacking biotin. The results are shown in Table 5. pBIO289 complemented all mutants that were defective in a single bio gene, confirming that all relevant genes lie upstream of the putative terminator. The open reading frame downstream from the terminator (ORF3) is not necessary for complementing any of the individual E. coli bio mutants.

Table 5

Complementation of various E. coli bio mutants by selected plasmids.

                 Plasmid
Mutation         pBR322    pBIO100    pBIO201    pBIO289
none             +++       +++        +++        +++
Δ(gal-uvrB)      -         -          -          -
ΔbioA            -         +          ++         +++
bioB             -         -          +++        ++
bioC23           -         +++        +++        +++
ΔbioD            -         -          -          +++
Δ(mal-bioH)      -         +++        +++        +++



EXAMPLE VII. Construction of full length wild-type and engineered B. subtilis biotin operons. 

Experiments to construct engineered, full length bio operons for integration and amplification in the B. subtilis chromosome are described below.

VIIA: Reconstruction of a full length wild-type bio operon

Since it had proved necessary to clone the 5' end of the B. subtilis biotin operon at low copy number for work in E. coli, a low copy number plasmid was also used for construction of complete and engineered biotin operons for integration into the B. subtilis chromosome. As described above, the 5' end of the B. subtilis bio operon was cloned as a 3 kb PstI-BamHI fragment in pCL1921 (Lerner and Inouye, 1990, supra), a low copy number plasmid present at about 5-10 copies per cell, to give pBIO350.

A full length biotin operon was then reconstructed by adding the 3' portion of the bio operon from pBIO201 to pBIO350. A 10 kb BamHI to EcoRI fragment from pBIO201, which contains the majority of the biotin operon as well as about 3 kb of downstream DNA, was ligated into pBIO350 that had been gapped with BamHI and EcoRI. Two resulting plasmids that had the correct anticipated structure were called pBIO400 and pBIO401. pBIO401 complements all known E. coli bio mutants, including a ΔbioA-D mutant. When pBIO401 was selected by complementation in a ΔbioA-D strain and then produced in a rich medium, it was stable enough to yield usable quantities of plasmid DNA. Since the spectinomycin-resistance gene carried by pBIO401 is not expressed in B. subtilis, a cat cassette was added at the EcoRI site in pBIO401 to allow for selection, integration and amplification in B. subtilis wild-type or deregulated strains by methods known to those skilled in the art, resulting in the plasmid pBIO401cat_s.

To determine the effect of increased bio copy number on biotin production, pBIO401cat_s, which contained a single copy of the cat cassette (resistant to chloramphenicol at 5 ug/ml) and the entire bio operon, was integrated into the chromosomes of wild-type and biotin-deregulated strains of B. subtilis. The plasmid copy number was amplified by selection on 60 ug/ml chloramphenicol, and biotin production was thereby increased.

VIIB: Construction of engineered bio operons

In constructing engineered, deregulated biotin operons for integration into the B. subtilis chromosome, it was useful to install unique restriction sites between transcriptional signals, regulatory sites and coding regions to allow easy introduction of alternate elements or alleles. Also, unique sites were used to flank the engineered biotin operon in these constructions, so as to remove E. coli-derived vector sequences prior to integration.

Applicants discovered that construction of an engineered B. subtilis bio operon with a strong, constitutive promoter was not a straightforward task. It was not possible to maintain the entire B. subtilis biotin operon on a single plasmid in E. coli, even a low-copy plasmid, when the operon was transcribed by a strong constitutive promoter, e.g., the SPO1-15 or SPO1-26 promoter. An alternative and novel strategy had to be developed to introduce an amplifiable DNA fragment containing the entire engineered bio operon into the B. subtilis chromosome. First, for cloning and engineering purposes the operon was manipulated in two separate pieces: 5' and 3' cassettes. Next, when the DNA engineering was completed, the relevant DNA fragments from the appropriate 5' and 3' cassettes were ligated and the ligated cassettes were transformed directly into B. subtilis. The ligations were designed to deliver either circular or concatameric molecules that would recombine with homologous sequences in the chromosome, thereby inserting the engineered DNA in a manner that could be amplified, with or without accompanying vector sequences.

Plasmids were constructed for use as backbone vectors for developing constructs that include an engineered bio operon. These plasmids were based on the low copy number vector pCL1920 (Lerner and Inouye, 1990, supra). The polylinker in pCL1920 was replaced with a polylinker flanked by NotI sites. The lac promoter present in pCL1920 was also removed for simplicity. The fragment containing the lac promoter and the polylinker was eliminated with EcoRI. The backbone, containing the pSC101 origin of replication and the omega fragment encoding resistance to spectinomycin, was re-circularized in the presence of a NotI linker to give pBIO121. Plasmid pJGP40 is a pBR322 derivative that contains a kanamycin resistance gene cloned into a polylinker flanked by NotI sites. The NotI fragment encoding kanamycin resistance from pJGP40 was cloned into the NotI site of pBIO121 to give pBIO124, the two orientations being indicated by "a" and "b" (Fig. 12). Digestion of both pBIO124 derivatives with Asp718I and religation eliminated the kanamycin resistance element, leaving a complete polylinker, to give plasmids pBIO126a and pBIO126b.

VIIBi: Engineering a 5' cassette

In considering which functional elements at the 5' end of the bio operon should be separated by unique restriction sites, the elements of most interest were 1) the putative stem-loop termination site upstream from the putative biotin promoter, 2) the putative promoter-operator-leader region, and 3) the ribosome binding site, initiation codon, and 5' coding region of bioW. Our strategy, as depicted in Fig. 12, was to introduce by PCR HindIII and SalI sites 70 bp upstream from the terminator, which is upstream from the biotin promoter. To separate the terminator from the promoter-operator region, the ClaI site in this region was converted to an XhoI site. Conversion of the Eco47III site, which precedes the ribosome binding site of bioW, to an XbaI site separates the promoter-operator-leader region from the ribosome binding site-start codon fragment. A description of all the PCR primers used and their orientation is indicated in Fig. 13A. Table 6 lists the fragments generated by PCR. A three-way ligation with 1) PCR fragment E, which introduces a 5' HindIII-SalI site and converts the Eco47III site to an XbaI site, 2) PCR fragment B, which converts the Eco47III site to an XbaI site and extends to the BamHI site in bioW, and 3) a HindIII-BamHI digest of pBIO125A vector, resulted in plasmid pBIO139. A second three-way ligation with PCR fragments C, D, and SalI and XbaI digested pBIO139 converted the ClaI site to an XhoI site and completed the initial construction as plasmid pBIO144. pBIO144 contains a modified wild-type 5' end of the bio operon with a unique XhoI site replacing the ClaI site upstream of the promoter/operator region and a unique XbaI site replacing the Eco47III site immediately downstream of this region (Fig. 13A). The expected DNA sequence of pBIO144 from the SalI site to the BamHI site was confirmed.

VIIBii: Engineered 3' bio cassettes

3' cassettes were constructed in pBIO126A, a low copy number plasmid with a NotI-flanked polylinker. To enable pBIO126A to be used as a vector for integration and amplification, a PCR fragment of the cat gene from pHW9 [Horinouchi and Weisblum, J. Bacteriol. 150, 815-825 (1982)] was introduced at the BstBI site to give plasmid pBIO146A. The BamHI to EcoRI fragments from pBIO201 (Example IC) and pBIO289 (Example VI) were cloned into the polylinker of pBIO146A to give plasmids pBIO151 and pBIO152, respectively. The two plasmids vary in the amount of 3' flanking sequence accompanying the bio operon.

VIIC: Regulation of engineered bio operons by a constitutive promoter

Different constitutive promoters, such as those from the SPO1 bacteriophage, e.g., SPO1-26 or SPO1-15 [Lee and Pero, J. Mol. Biol. 152, 247-265 (1981)], can be added in place of the bio promoter-operator region between the XhoI and XbaI sites. After being integrated and amplified in the B. subtilis chromosome, this results in a vector capable of directing expression of the entire biotin operon from a constitutive promoter. This in turn leads to substantially improved biotin production.

Table 6

PCR generated fragments for 5' bio cassette construction

PCR Fragment   Upstream primer   Downstream primer   bp*   Functional Elements
B              Leader1           ANEB1224            733   RBS, start codon, 5' bioW
C              ORF4.1            BIOL3               140   Upstream homology, terminator
D              BIOL4             BIOL5               95    Promoter, operator, leader
E              ORF4.1            BIOL5               235   Upstream homology, terminator, promoter, operator, leader

* Fragment size in bp after digestion with restriction endonucleases.
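Fragment E covers the same functional elements as fragments C and D taken together, and the sizes listed in Table 6 are consistent with that reading. The short check below treats the fragment sizes as simply additive, which is an assumption; it ignores any bases gained or lost at the engineered junction between C and D.

```python
# Fragment sizes in bp (after restriction digestion), as listed in Table 6.
FRAGMENT_BP = {"B": 733, "C": 140, "D": 95, "E": 235}


def combined_size(*names):
    """Sum the listed sizes of the named PCR fragments."""
    return sum(FRAGMENT_BP[name] for name in names)


c_plus_d = combined_size("C", "D")
print(c_plus_d, FRAGMENT_BP["E"])
```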



VIICi: Construction of a 5' cassette with the SPO1-26 promoter

A 5' cassette with the SPO1-26 promoter reading into the biotin operon was constructed by replacing the XhoI to XbaI fragment from the engineered "wild-type" promoter with a PCR fragment containing the SPO1-26 promoter. The PCR fragment containing the SPO1-26 promoter was generated from pNH201, a pUC8 subclone of the cloned SPO1-26 promoter [Lee, G., Talkington, C. and Pero, J., Mol. Gen. Genet. 180, 57-65 (1980)]. The primers used (XHO26A and XBA26B, Table 6A) introduced an XhoI site at the upstream side of the promoter and an XbaI site at the downstream side of the promoter. The XhoI-XbaI digested PCR fragment was ligated with XhoI-XbaI digested pBIO144, and the ligated DNA was transformed into E. coli YMC9. Plasmid minipreps from Spec^r transformants were screened for acquisition of an EcoRV site that is located within the SPO1-26 promoter region. Plasmids showing the expected EcoRV site were then screened by PCR, using primers XHO26A and BIOW1, to confirm the juxtaposition of the SPO1-26 promoter and 5' bioW fragments. Two plasmids with the correct structure, pBIO158 and pBIO159, each had the expected sequence.

VIICii: Construction of a 5' cassette with the SPO1-15 promoter.

A DNA fragment containing the SPO1-15 promoter (Lee et al., supra) with appropriate ends for cloning in pBIO144 was also generated by PCR. The primers (XHO15B and XBA15C, Table 6A) were selected to generate a fragment which flanked the SPO1-15 promoter with XhoI (upstream) and XbaI (downstream) sites. There is also a potential stem-loop structure near the beginning of the transcript from SPO1-15. The downstream primer was also designed to extend the potential stem-loop at the new 5' end of the bio mRNA to include the expected +1 base of the transcript initiated by the SPO1-15 promoter. The XhoI to XbaI fragment containing SPO1-15 was ligated into XhoI-XbaI digested pBIO144. This ligated DNA was transformed into E. coli to make pBIO168 and pBIO169. pBIO168 and pBIO169 are identical isolates that contain a 5' bio cassette with the SPO1-15 promoter.

Table 6A. PCR primers



Name      DNA Sequence of Primer
XHO26A    5' GGCCCTCGAG GCCTACCTAG CTTCCAAGAA 3'
XBA26B    5' GGCCTCTAGA GCGTCCTGCT GTTGTTAAGA 3'
BIOW1     5' GCCAATCCAT TCTGGAGA 3'
XHO15B    5' GGCCCTCGAG GCTATTGACG ACAGCTATGG TT 3'
XBA15C    5' GGCCTCTAGA ACAGGCGGGG TTGCCCCCGC CTGTAATTAA ATTATTACAC A 3'
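The primer design described above can be checked directly against the sequences in Table 6A. The sketch below looks for the XhoI (CTCGAG) and XbaI (TCTAGA) recognition sequences in each primer and searches XBA15C for an inverted repeat of the kind that could form a stem-loop. The sequences are transcribed from Table 6A as printed here; the helper names and the 10-base arm length are illustrative choices, not part of the original disclosure.

```python
XHOI = "CTCGAG"  # XhoI recognition sequence
XBAI = "TCTAGA"  # XbaI recognition sequence

# Primer sequences transcribed from Table 6A (5' to 3', spaces removed).
PRIMERS = {
    "XHO26A": "GGCCCTCGAGGCCTACCTAGCTTCCAAGAA",
    "XBA26B": "GGCCTCTAGAGCGTCCTGCTGTTGTTAAGA",
    "BIOW1": "GCCAATCCATTCTGGAGA",
    "XHO15B": "GGCCCTCGAGGCTATTGACGACAGCTATGGTT",
    "XBA15C": "GGCCTCTAGAACAGGCGGGGTTGCCCCCGCCTGTAATTAAATTATTACACA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def site_positions(seq, site):
    """Return all 0-based start positions of site in seq (overlaps allowed)."""
    positions, start = [], seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def find_inverted_repeat(seq, arm_len):
    """Return (left_start, right_start, left_arm) for the first arm of
    arm_len bases whose reverse complement occurs further downstream."""
    for i in range(len(seq) - 2 * arm_len + 1):
        arm = seq[i:i + arm_len]
        j = seq.find(reverse_complement(arm), i + arm_len)
        if j != -1:
            return i, j, arm
    return None


xho_sites = {name: site_positions(seq, XHOI) for name, seq in PRIMERS.items()}
xba_sites = {name: site_positions(seq, XBAI) for name, seq in PRIMERS.items()}
hairpin = find_inverted_repeat(PRIMERS["XBA15C"], 10)

print(xho_sites)
print(xba_sites)
print(hairpin)
```

In each XHO or XBA primer the site follows a four-base GGCC extension. The inverted repeat in XBA15C lies immediately after the XbaI site, with a four-base gap between the two arms.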



VIICiii: Construction and integration of full length engineered "wild-type" bio operons.

Full length engineered biotin operons were constructed by introducing SalI to BamHI fragments from various engineered 5' bio cassettes, described above, between the SalI and BamHI sites of pBIO151 and pBIO152, which contain the 3' end of the bio operon and a selectable cat gene (Example VIII (b)(iii), above). Appropriate transformants were selected by complementation of the E. coli strain RY607 (Δbio) to Bio+. For example, introduction of the SalI to BamHI fragment from the "wild-type" engineered 5' biotin operon of pBIO144 into pBIO151 resulted in pBIO155 containing the full length wild-type bio operon with the 3' flanking region, and a cat gene oriented in the same direction as the biotin operon. Ligation of the same 5' fragment from pBIO144 into pBIO152 gave pBIO156 with the same features as pBIO155 but lacking the 3' flanking region downstream of the bio operon.

Plasmids pBIO155 and pBIO156 were integrated into B. subtilis with the entire plasmid or with the E. coli vector sequences deleted (Table 7). B. subtilis strains PY79, BI421 and HB3 (the latter two both birA mutants; see Example X) were transformed with plasmids pBIO155 and pBIO156 and chloramphenicol-resistant colonies were selected. Amplification of the integrated plasmid was achieved by streaking the strains on plates with increasing levels of chloramphenicol (Table 7, strains BI228, 230, 235, 237, 243 and 245).

To construct strains which contained amplified copies of the wild-type biotin operon but no E. coli sequences, plasmids pBIO155 and pBIO156 were digested with NotI. The larger fragment from each digest was circularized and used to transform B. subtilis strains PY79, BI421, and HB3. The cassette was amplified by streaking on plates with increasing levels of chloramphenicol (Table 7, strains BI232, 234, 239, 241, 247, and 249).

SPO1 promoter-driven bio operon ligations were transformed directly into B. subtilis. Isolated NotI to BamHI fragments from 5' cassettes (5' flanking sequence, promoter region and 5' bioW) and isolated BamHI to NotI fragments from 3' cassettes (3' bioW, bioA, bioF, bioD, bioB, bioI, ORF2, terminator, 3' flanking sequence, and cat) were ligated under standard conditions and used to transform various B. subtilis strains. The 5' NotI to BamHI cassettes were from pBIO158 containing the SPO1-26 promoter or pBIO168 with the SPO1-15 promoter. The 3' cassettes were from either pBIO151 with the extended 3' flanking sequence or pBIO152 with the truncated 3' flanking sequence (Table 7, strains BI267, 268, 274, 276, 278, and 282).

Each of these DNA ligations was transformed into a wild-type strain PY79 and in some cases also into a biotin-deregulated strain, BI421 (birA mutant, see Example XB below). Competent cells of the strains were prepared by standard methods and transformed with the different bio-containing DNA ligation mixtures described above, selecting for Cm^r. Since these DNAs cannot replicate in B. subtilis, Cm^r transformants arise by integration of the ligated DNA into the chromosome at the bio locus via recombination between homologous sequences present on the chromosome and the transforming DNA. In each experiment, 10-50 Cm^r transformants were selected for characterisation. Transformants were screened by PCR to confirm that the SPO1 promoter was juxtaposed to bioW.

Table 7



First generation biotin production strains.

Strain   Parent   Integrated DNA            Promoter   3' Flanking Sequences
BI222    PY79     pBIO401cat_s              biotin     yes
BI224    BI421    pBIO401cat_s              biotin     yes
BI226    HB3      pBIO401cat_s              biotin     yes
BI228    PY79     pBIO155                   biotin     yes
BI230    PY79     pBIO156                   biotin     no
BI232    PY79     NotI fragment pBIO155     biotin     yes
BI234    PY79     NotI fragment pBIO156     biotin     no
BI235    BI421    pBIO155                   biotin     yes
BI237    BI421    pBIO156                   biotin     no
BI239    BI421    NotI fragment pBIO155     biotin     yes
BI241    BI421    NotI fragment pBIO156     biotin     no
BI243    HB3      pBIO155                   biotin     yes
BI245    HB3      pBIO156                   biotin     no
BI247    HB3      NotI fragment pBIO155     biotin     yes
BI249    HB3      NotI fragment pBIO156     biotin     no
BI267    PY79     5' pBIO158, 3' pBIO151    SPO1-26    yes
BI268    BI421    5' pBIO158, 3' pBIO151    SPO1-26    yes
BI274    PY79     5' pBIO158, 3' pBIO152    SPO1-26    no
BI276    BI421    5' pBIO158, 3' pBIO152    SPO1-26    no
BI278    PY79     5' pBIO168, 3' pBIO151    SPO1-15    yes
BI282    PY79     5' pBIO168, 3' pBIO152    SPO1-15    no

EXAMPLE VIII. Characterization of first generation B. subtilis biotin production strains.

Southern blot experiments were used to confirm the structure of the integrated cassettes and to assess the degree of amplification of a representative subset of the engineered strains. From these experiments, it was clear that the presence of the SPO1 promoters had a significant effect on the degree of amplification. Engineered, full-length bio operons containing a wild-type bio promoter were amplified in strains grown on 60 ug/ml chloramphenicol to levels similar to those seen for other operons under similar conditions (estimated 15 copies/cell). However, bio operons driven from an SPO1-15 promoter showed 2-fold less amplification and bio operons driven from the SPO1-26 promoter were about four-fold less amplified. Thus, B. subtilis cells have a limited tolerance for at least one of the products encoded by the bio operon.
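The relative amplification levels translate into rough copy-number estimates as follows. This is back-of-the-envelope arithmetic only: it assumes the estimate of 15 copies/cell for the wild-type promoter constructs and applies the stated fold reductions directly, so the results are approximate.

```python
# Estimated copies/cell for operons with the wild-type bio promoter (from the text).
WILD_TYPE_COPIES = 15

# Fold reduction in amplification relative to the wild-type promoter constructs.
FOLD_REDUCTION = {
    "wild-type bio promoter": 1,
    "SPO1-15": 2,
    "SPO1-26": 4,  # "about four-fold" in the text
}

estimated_copies = {
    promoter: WILD_TYPE_COPIES / fold for promoter, fold in FOLD_REDUCTION.items()
}

for promoter, copies in estimated_copies.items():
    print(f"{promoter:24s} ~{copies:.1f} copies/cell")
```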

VIIIA. Assay for production of biotin and biotin precursors in test tube cultures.

To determine the effect of multiple copies of the wild-type bio operon or SPO1-modified bio operons on biotin production, the wild-type and biotin-deregulated B. subtilis strains containing these engineered bio operons, integrated and amplified in their chromosomes, were tested for biotin production. The results are shown in Table 8.

All of the strains were grown overnight in 5 ml of VY medium at 37°C, centrifuged, and the supernatant solutions autoclaved for 5 minutes to kill any remaining cells. (Biotin and desthiobiotin are stable to autoclaving.) The supernatant solutions were diluted in biotin-free medium and inoculated with E. coli strains RY604 (ΔbioH) and RY607 (ΔbioABFCD). RY604 and RY607 were constructed by transducing the relevant regions from BM7086 and a Δ(gal-uvrB)219 strain, respectively, (Cleary and Campbell, supra; Hatfield et al., supra) into MM294. The former grows on both biotin and biotin vitamers, while the latter grows on biotin only. The biotin and biotin vitamers produced by different B. subtilis mutant strains were calculated from a standard curve at OD600.

The wild-type strain of B. subtilis, PY79, typically yielded about 6-10 ug/l of biotin in this assay (Table 8). The VY medium used for these experiments had 20-45 ug/l biotin before cell growth. Thus, most of the biotin contributed by the medium was consumed during growth by wild-type bacteria.

Several biotin analog-resistant mutants produced 50-100 ug/l biotin, 5-10 fold more biotin than found with the wild-type strain. Two biotin analog-resistant strains with birA-like mutations were used. One mutant strain, HB3, contains a spontaneous homobiotin-resistant mutation. The other strain, BI421, contains an ethylmethylsulfate-generated α-dehydrobiotin-resistant mutation which had been crossed into an unmutagenized background (see Example X). Both strains yielded 50-100 ug/l of biotin in these experiments (Tables 8 and 9).

Integration and amplification of a "wild-type" copy of the bio operon in the wild-type strain PY79 generally improved biotin production 10-50 fold over that seen with PY79 alone (Table 8). Such strains (BI230 and BI234) produced 150-600 ug/l compared to the 6-10 ug/l produced by PY79 alone. However, more dramatic results were seen from the assays of the birA mutant strains with integrated and amplified copies of the wild-type bio operon. An additional 5-10 fold improvement in biotin production was observed, with yields up to 2,000 ug/l biotin, with this assay (see BI241, BI237, BI249; Table 8).

Analysis of wild-type B. subtilis strains containing the engineered bio operons with an SPO1 promoter resulted in an improvement in biotin production, with biotin titers generally 1000-2000 ug/l (see BI267 and BI274; Table 9) versus 150-600 ug/l with the wild-type bio promoter. No dramatic difference was seen in biotin production between wild-type and birA mutant strains containing constitutive promoters (Table 9).

A second type of assay employs Lactobacillus plantarum as a biotin indicator (Wright et al., Proc. Soc. Exp. Biol. Med. 56:95-98, 1944) and Saccharomyces cerevisiae as an indicator of biotin vitamers (Baldet et al., Eur. J. Biochem. 217:479-485, 1993). Assays were performed as described except that an antibiotic was added to the assay cultures to reduce interference by contamination. Since L. plantarum is sensitive to most antibiotics, a spontaneous streptomycin-resistant mutant, L. plantarum str3, was selected and used for biotin assays in the presence of 50 ug/ml streptomycin sulfate. S. cerevisiae is naturally resistant to most antibacterial compounds and was also used in the presence of 50 ug/ml streptomycin sulfate. The L. plantarum growth response to biotin decreases more gradually (over a dilution range of about 50-fold) than the E. coli growth response. S. cerevisiae is more responsive to DAPA and KAPA than E. coli.

When cultures were assayed for biotin production with Lactobacillus as an indicator, more precise levels could be determined. Using these conditions, B. subtilis strains with the engineered SPO1-bio operons yielded almost twice as much biotin as deregulated B. subtilis strains with amplified copies of the wild-type bio operon (Table 10).
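The "almost twice as much" comparison can be reproduced from the biotin titers in Table 10. The sketch below assumes the comparison is between BI282 (amplified SPO1-15-bio operon in PY79) and BI239 (amplified wild-type bio operon in a birA background); the helper name is illustrative.

```python
# Biotin titers (ug/l) measured with L. plantarum str3, copied from Table 10.
TABLE_10_BIOTIN = {"PY79": 6, "BI421": 45, "BI239": 580, "BI282": 1100}


def fold_over(strain, reference):
    """Biotin titer of strain relative to a reference strain."""
    return TABLE_10_BIOTIN[strain] / TABLE_10_BIOTIN[reference]


spo1_vs_wild_type_operon = fold_over("BI282", "BI239")
print(round(spo1_vs_wild_type_operon, 2))
```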

Table 8

Biotin production by various B. subtilis strains containing integrated and amplified wild-type bio operons in test tube cultures.

Strain name   Relevant features     Biotin (ug/l)*   Biotin & Biotin Vitamers (ug/l)**
PY79A         prototroph            10               10
PY79B         prototroph            6                6
HB3           birA                  100              100
BI230         PY79::[pBIO156]60     150              250
BI234         PY79::[Not/156]60     600              1,200
BI245         HB3::[pBIO156]60      500              2,000
BI249         HB3::[Not/156]60      1,500            1,500
BI237         BI421::[pBIO156]60    1,000            3,000
BI241         BI421::[Not/156]60    2,000            3,000

* Assayed using E. coli RY607 (ΔbioABCDF)
** Assayed using E. coli RY604 (ΔbioH)

Table 9



Biotin production by various B. subtilis strains containing integrated and amplified SPO1-engineered bio operons.

Strain name   Relevant features          Biotin (ug/l)*   Biotin & Vitamers (ug/l)**
PY79A         prototroph                 10               10
BI421         birA                       50               120
BI421         birA                       80               150
BI267         PY79::[Not/158,151]60      1,000            2,000
BI268         BI421::[Not/158,151]60     1,300            5,000
BI274         PY79::[Not/158,152]60      1,500            2,500
BI276         BI421::[Not/158,152]60     1,500            3,500

* Assayed using E. coli RY607 (ΔbioABCDF)
** Assayed using E. coli RY604 (ΔbioH)



Table 10

Biotin production by various B. subtilis strains.

Strain name   Relevant features                      Biotin (ug/l)*   Biotin & Vitamers (ug/l)**
PY79          prototroph                                  6                10
BI421         birA                                       45               200
BI239         amplified wild-type bio operon/birA       580              2700
BI282         amplified SP01-15-bio operon/PY79        1100              3600

* Assayed with L. plantarum str3
** Assayed with S. cerevisiae



VIIIB. Strain evaluation for large-scale biotin and biotin precursor production.

Engineered biotin-producing strains can be evaluated for large-scale biotin and biotin precursor
production using fermentation technology. A range of fermenters and media conditions can be applied. As
an example, all of the following fermentations were performed in computer controlled 14-liter Chemap
fermenters utilizing a DO (dissolved oxygen) control, glucose-limited, fed-batch fermentation strategy. The
amount of biotin or biotin precursor produced was determined by inoculating serial dilutions of autoclaved
cell-free broth with the appropriate strains of Lactobacillus plantarum or Saccharomyces cerevisiae as
described above. The medium composition and other fermentation conditions are described in Table 11.

Biotin and biotin precursors produced by various strains are listed in Table 12. In all fermentations, 1 g/l
pimelic acid was added to both the initial batch and feed solutions. BI282, BI278 and BI276 were the most
optimized for biotin and biotin vitamer production.

Table 11. Biotin fermentation conditions and medium (VY) composition.

Component                      Amount    Volume       Time of addition

Initial Batch
A. Veal Infusion Broth         150.00    4.5 liters   Sterilized for 60 mins. in fermenter;
                                                      12.5 ml 50% NaOH; pH 6.8 prior to
                                                      sterilization, pH 6.6 post sterilization
   Yeast Extract                30.00
   Sodium Glutamate             30.00
   KH2PO4                       45.00
   MgCl2·6H2O                    9.00
   MnSO4·H2O                     0.30
   FeCl3·6H2O                    0.15
   (NH4)2SO4                    12.00
   MAZU DF37C                   15.00                 Approximately 800 ml volume gain
B. Glucose                     150.00    0.3 liters   Added to fermenter immediately prior
                                                      to inoculation
   CaCl2·2H2O                    6.00

Feed Solution
C. KH2PO4                       54.70    0.2 liters   Added to D
   MgSO4·4H2O                    6.00
D. Glucose                   3,000       3.3 liters   Combined with C and fed to fermenter

Inoculum Medium
E. Inoculum Medium                       300 ml       Autoclaved separately; pre-sterilization
   Composition - "A" +                                pH adjusted to pH 6.8
   0.35 gm CaCl2·2H2O
F. 20% Maltose                           50 ml        Added to E after cooling
G. 20% Glucose                           25 ml        Added to E after cooling

H. 3.5% H2SO4                            200 ml       Usual requirement for pH control

Base
I. Anhydrous NH3                                      pH control

All solutions (A-G) sterilized separately and combined when cool.
Conditions: Air: 1.5-2.0 vvm; RPM: 1000; pH 6.3; Temp. 37.0 °C; Pressure 0.6 bar.
Table 12

Biotin and vitamer production by first generation B. subtilis strains in bench scale fermenters.

Fermentation Run (a)   Strain   Promoter/strain background   Biotin (mg/liter) (b)   Vitamers (mg/liter) (c)
B22 (d)                BI239    Pbio/BI421 (birA)                  1                      30
B19                    BI268    SP01-26/BI421 (birA)               5                      40
B23                    BI276    SP01-26/BI421 (birA)               8                     120
B20                    BI278    SP01-15/PY79                      10                      60
B24                    BI282    SP01-15/PY79                       8                     100

(a) All fermentations used VY salts medium (described in Table 11) with 1 g/liter pimelic acid in both
batch and feed.
(b) Assayed at 34 hours with L. plantarum str3.
(c) Assayed at 34 hours with S. cerevisiae.
(d) The OD600 of the culture was decreasing after 24 hours.



Table 13

Effect of complex nitrogen/nutrient source concentration on biotin and vitamer production by strain BI282
in bench scale fermenters.

Fermentation Run   Medium (a)   Biotin (mg/liter) (b)   Vitamers (mg/liter) (c)
B30                1 X BY            10                      150
B29                2 X BY            16                      200
B31                3 X BY             8                      130

(a) Fermentation conditions as described in Table 5, except with 1 g/liter pimelic acid in batch and
feed, and with beef extract and proteose peptone substituted for veal infusion broth.
(b) Assayed at 30 hours with L. plantarum str3.
(c) Assayed at 30 hours with S. cerevisiae.



VIIIC. Analysis of fermentation broths by bioautography.

The spectrum of vitamers secreted by biotin-producing strains can be assayed by bioautography
techniques. In the present case, fermentation broths were clarified by centrifugation and sterilized by
autoclaving. One microliter aliquots of supernatant culture fluids were spotted on Baker-flex microcrystalline
cellulose thin-layer chromatography (TLC) plates and the compounds were separated with a solvent of
n-butanol and 1N HCl (6:1 v/v). After drying, the chromatograph was incubated for 1 hour, face down, on a
biotin-free agar plate containing 2,3,5-triphenyltetrazolium chloride and kanamycin, and impregnated with E.
coli strain RY604 (ΔbioH)/pOK12 (KanR). The KanR plasmid pOK12 [Vieira and Messing, Gene 100, 189-194
(1991)] was added to RY604 merely to provide antibiotic resistance so that contamination could be reduced
in the assay. The TLC plate was then removed and the agar plate incubated at 37 °C. After 20 hours, spots
of growth corresponding to the location of biotin and vitamers on the TLC plates appeared. Fig. 15 shows
biotin and vitamer standards with representative fermentation samples. The Rf values observed with this
chromatography system are indicated in Table 14. The technique can also be employed using paper
chromatography instead of cellulose TLC (Table 14). Comparison to bioautography utilizing RY634
(ΔbioA-D::KanR, Bis+), which detects only biotin and biotin sulfoxide, indicated that substantial quantities
of both desthiobiotin and biotin were present in the fermentation broth.

Addition of pimelate to the fermentation medium results in an increased level of a biotin vitamer which
is probably KAPA (Fig. 15, lane D). This was shown by an increase in the KAPA/BSO spot compared to
similar fermentations without pimelate (Fig. 15, lane B). Part of this material was demonstrated to be biotin
sulfoxide, since it was also detected on bioautography utilizing RY634 (ΔbioA-D::KanR, Bis+), which detects
only biotin and biotin sulfoxide. However, the intensity of the spot generated with RY634 at the KAPA/BSO
location was significantly less than that detected with RY604/pOK12.

The accumulation of desthiobiotin and KAPA indicates limitations at the biosynthetic steps encoded by
bioB and bioA. Such limitations may be overcome by elevated expression of these individual genes or by
increases in the pools of substrates, cofactors or cooperating proteins for these steps. The expression of
bioB or bioA can be separately elevated by either inserting subclones of the individual genes or PCR
copies of the genes in an expression vector with a strong promoter (SP01, veg, etc.) and introducing the
DNA into the cell on a plasmid such as pUB110, in a phage such as SPβ, or integrated directly into a
nonessential gene in the chromosome such as bpr.

Table 14

Observed and reported Rf values for biotin vitamers on cellulose chromatography

                     Rf
Biotin vitamer       Observed, TLC   Literature (a), Paper
DAPA                 0.09            0.09
KAPA                                 0.35
Biotin sulfoxide     0.52
Biotin               0.86            0.70
Dethiobiotin         0.94            0.82

(a) Agric. Biol. Chem. 39, 779-784 (1975)
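
As an illustration of how the Rf values in Table 14 might be used to assign bioautography spots, the following minimal Python sketch computes an Rf value from migration distances and looks up the nearest observed TLC value. The function names, the tolerance of 0.05 and the example distances are assumptions made for illustration; they are not part of the described procedure.

```python
# Observed TLC Rf values copied from Table 14 (no observed TLC value is listed for KAPA).
OBSERVED_RF = {
    "DAPA": 0.09,
    "Biotin sulfoxide": 0.52,
    "Biotin": 0.86,
    "Dethiobiotin": 0.94,
}


def rf_value(spot_distance_mm, front_distance_mm):
    """Rf = distance travelled by the spot / distance travelled by the solvent front."""
    if front_distance_mm <= 0:
        raise ValueError("solvent front distance must be positive")
    return spot_distance_mm / front_distance_mm


def closest_vitamer(rf, tolerance=0.05):
    """Return the vitamer whose tabulated Rf is nearest to rf, or None if it is not within tolerance."""
    name, ref = min(OBSERVED_RF.items(), key=lambda item: abs(item[1] - rf))
    return name if abs(ref - rf) <= tolerance else None


# Hypothetical measurement (not from the source): spot at 43 mm, solvent front at 50 mm.
print(closest_vitamer(rf_value(43.0, 50.0)))
```

A spot that falls outside the tolerance of every tabulated value is reported as unassigned rather than forced onto the nearest vitamer.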




VIIID: Analysis of bio-specific mRNA synthesis in the engineered strains.

Northern blot experiments were performed on selected strains to examine the transcription pattern of
the bio operon and the amount of bio-specific mRNA present in the various engineered strains. As
expected, a 7 kb RNA transcript covering the entire bio operon could be seen in all engineered strains.
Lesser amounts of this transcript were also present in wild-type and birA mutant B. subtilis strains. In
addition, all strains contained larger amounts (~8-fold) of a 5 kb transcript covering the first five genes in the
bio operon, suggesting that a significant amount of transcript ended at a potential termination site after bioB
(Fig. 5A). Significant amounts of a small transcript of 0.8 kb that covered most of the bioW gene were also
seen. This transcript ended near a sequence with similarity to the consensus sequence of a site implicated
in catabolite repression [Chambliss, G.H., "Bacillus subtilis and Other Gram-Positive Bacteria" edited by
Sonenshein et al., Am. Soc. Microbiology, pp. 213-219 (1993)]. The relative ratio of the three transcripts to
each other was the same in wild-type strains grown in the absence of biotin, birA mutant strains, or
engineered strains driven by either the wild-type bio promoter or an SP01 promoter. Only the absolute
amount of total bio-specific RNA varied dramatically in these strains. The engineered first generation
production strains with a wild-type or an SP01-15 promoter produced about 30-fold or 60-fold, respectively,
more bio-specific RNA than a derepressed wild-type cell. The bio-specific RNA levels in the birA mutant
were only slightly (2-3 fold) higher than RNA levels in the derepressed wild-type cell, and were not affected
by growth on biotin.

It appeared from these experiments that the SP01 promoters were directing the synthesis of at least 4-fold
more RNA per operon than the wild-type bio promoter. However, with the reduced copy number of the
SP01-bio operons, the total amount of bio-specific RNA was at most only two-fold more than seen with a
fully amplified, wild-type operon. The RNA levels correlated with biotin production levels: strains with the
engineered, amplified SP01-bio operons produced about two-fold more biotin than birA mutant strains with
the amplified wild-type operons, thus confirming that increasing the expression of one or more of the bio
genes would lead to increases in biotin titer.
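
The relationship between total RNA, RNA per operon and copy number in the preceding paragraph can be made explicit with a small calculation. The Python sketch below assumes, for illustration only, that total bio-specific RNA is simply the product of RNA per operon and operon copy number; the function name is hypothetical.

```python
def relative_copy_number(total_rna_ratio, per_operon_rna_ratio):
    """Copy-number ratio implied if total RNA = (RNA per operon) x (number of operon copies)."""
    if per_operon_rna_ratio <= 0:
        raise ValueError("per-operon RNA ratio must be positive")
    return total_rna_ratio / per_operon_rna_ratio


# From the Northern data: about 60-fold (SP01-15) versus about 30-fold (wild-type promoter)
# more bio-specific RNA than a derepressed wild-type cell, i.e. roughly two-fold in total.
total_ratio = 60 / 30

# With at least 4-fold more RNA per operon, the SP01-bio operons would be present at
# no more than about half the copy number of the amplified wild-type operon.
print(relative_copy_number(total_ratio, 4))
```

Because the text gives the per-operon ratio as a lower bound and the total ratio as an upper bound, the result should be read as an upper bound on the relative copy number under this simplified model.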
EXAMPLE IX. Genetic mapping of a birA-like gene of B. subtilis

In addition to cloned DNA that contains the B. subtilis bio operon, two recombinant plasmids, pBIO113
and pBIO114, were recovered which contained B. subtilis chromosomal DNA that complemented a
temperature-sensitive mutation in the birA regulatory gene of E. coli. The birA gene product functions in E.
coli both as an enzyme that catalyzes the addition of biotin to apoenzymes and as a repressor protein that
negatively regulates expression of the bio biosynthetic genes.

The location of the birA-complementing gene on the B. subtilis chromosome was mapped by PBS1
generalized transduction. To do this, a B. subtilis bacterial strain was generated that contained a selectable
antibiotic-resistance marker, cat, near the birA locus. Then, by determining the position of cat in the
chromosome, the birA-complementing DNA was mapped. A derivative of pBIO113 was constructed that
contained a cat cassette (obtained from pMI1101 [Youngman et al., Plasmid 12, 1-9 (1984)]), and this
integration vector was introduced into the B. subtilis chromosome. Since the 7.0 kb cloned insert of
pBIO113 is homologous to its corresponding segment in the B. subtilis chromosome, integration of the
cat-containing pBIO113 into the chromosome by Campbell recombination [Campbell, Adv. Genet. 11, 101-145
(1962)] introduced the cat gene near birA. Using standard PBS1 transduction mapping, the birA-complementing
DNA was mapped to the 202° region of the chromosome, very near the trp locus (> 90%
linkage).

EXAMPLE X: Construction of a B. subtilis host strain deregulated for biotin production.

XA. Construction of biotin analog-resistant strains.

Biotin analogs were used to select for strains that were deregulated for biotin production. Among the
mutations sought were those in a potential homolog of the E. coli birA gene. However, it is expected that
selection for resistance to biotin analogs can also yield strains with mutations in the operator site(s) of birA,
or in genes encoding functions responsible for the transport of the analogs into the cell. Analog-resistant
mutants can also contain a gene or genes that encode enzymes resistant to inhibitors including, but not
limited to, feedback inhibitors. Biotin analogs (homobiotin, α-dehydrobiotin, and 5(2-thienyl)pentanoic acid)
were obtained from Nippon Roche KK (Kanagawa, Japan). Mutagenized cells of B. subtilis were plated on
TBAB (Difco Tryptose Blood Agar Base, cat. no. 0232-01-9) plates containing a crystal of each biotin
analog.

B. subtilis PY79 (an SPβ-cured prototroph derived from S.A. Zahler strain CU1769 (metB5, glnA100);
Youngman et al. 1984, supra) was mutagenized with ethyl methane sulfonate (EMS) to give 90% killing.
Surviving cells were grown overnight and 0.1 ml of culture was plated on TBAB plates. A crystal of two of
the biotin analogs was placed on the plate and incubated overnight at 37 °C. The α-dehydrobiotin and
5(2-thienyl)pentanoic acid crystals inhibited the growth of B. subtilis and gave zones of clearing around the
crystals. Within the clear zones individual colonies appeared, providing likely candidates for biotin-analog
resistant strains.

Several colonies were picked from the zones of clearing around the analogs α-dehydrobiotin and
5(2-thienyl)pentanoic acid and named DB-1 to DB-4 inclusive if selected from an α-dehydrobiotin zone, and
TP-1 to TP-3 inclusive if selected from a 5(2-thienyl)pentanoic acid zone. The isolated colonies were
streaked onto minimal-casamino acid plates with various amounts of each analog. All of these strains grew
better (i.e., produced a larger colony) than wild-type cells on their respective analogs.

An additional 27 mutants were selected subsequently in similar plate screenings for their resistance to
the analogs homobiotin (HB) and α-dehydrobiotin (α-DB). For this latter selection, B. subtilis PY79 cultures
were subjected to mutagenesis with EMS in two independent experiments. The first mutagenesis, EMS1,
resulted in 96% killing whereas the second, EMS2, resulted in 82% killing of the bacteria. Overnight
cultures of PY79, EMS1, and EMS2 grown in rich medium were plated on TBAB (rich), BIOS (biotin free) or
MUM (glucose-minimal) plates and a crystal of homobiotin or α-dehydrobiotin was placed on each of the
plates. After 24 h, zones of killing were observed with a few resistant colonies growing within these zones.
Individual colonies were picked from homobiotin plates or α-dehydrobiotin plates and restreaked on BIOS
plates containing α-dehydrobiotin or homobiotin.

Potential repressor or operator deficient mutants were screened for their ability to secrete a measurable
level of biotin. Each mutant strain was assayed for biotin and biotin vitamer production. Each strain was
grown in VY medium (5 ml; 20 g/l Difco veal infusion broth and 5 g/l Difco yeast extract) for 18-24 hours at
37 °C, and the supernatant was sterile-filtered. The filtered supernatants were then serially diluted in a
biotin-free medium, and the serial dilutions were inoculated with E. coli strains RY604 (ΔbioH) and
RY607 (ΔbioABFCD); the former grows on both biotin and biotin vitamers while the latter grows on biotin only. The
biotin and biotin vitamers produced by different B. subtilis mutant strains were calculated from standard
curves generated with biotin and desthiobiotin. Six mutants from the collection produced about 100 ug/L of
secreted biotin: homobiotin resistant mutants HB3, HB9, and HB15, and α-dehydrobiotin resistant mutants
α-DB9, α-DB16, and α-DB17. Other mutants produced either no biotin or 10 ug/L. Other mutants selected by the
above method can be expected to provide 75 ug/L, 150 ug/L, or even 200 ug/L, 250 ug/L or 300 ug/L.
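
The calculation of a titer from a standard curve and a serial dilution can be sketched as follows. This is a minimal Python illustration and not the procedure actually used: it assumes that the indicator-strain growth readings are monotonic in concentration over the calibrated range and uses simple linear interpolation between standards. The function names and all numerical values are hypothetical.

```python
from bisect import bisect_left


def interpolate_standard(reading, standards):
    """Linearly interpolate a concentration from (reading, concentration) standard points.

    standards must be sorted by strictly increasing reading.
    Returns None when the reading lies outside the calibrated range.
    """
    readings = [r for r, _ in standards]
    if reading < readings[0] or reading > readings[-1]:
        return None
    i = bisect_left(readings, reading)
    if readings[i] == reading:
        return standards[i][1]
    (r0, c0), (r1, c1) = standards[i - 1], standards[i]
    return c0 + (c1 - c0) * (reading - r0) / (r1 - r0)


def supernatant_titer(reading, dilution_factor, standards):
    """Concentration in the undiluted supernatant = interpolated concentration x dilution factor."""
    conc = interpolate_standard(reading, standards)
    return None if conc is None else conc * dilution_factor


# Hypothetical standard curve: (growth reading, biotin in ug/L). Values are illustrative only.
STANDARDS = [(0.0, 0.0), (0.2, 0.1), (0.4, 0.2), (0.8, 0.4)]

# A 1:1000 dilution of supernatant giving a reading of 0.3 would correspond to about 150 ug/L.
print(supernatant_titer(0.3, 1000, STANDARDS))
```

Readings outside the range of the standards return None, which reflects the practical need to choose a dilution that falls on the responsive part of the curve.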

XB. Mapping biotin analog resistance mutations

Example IX described mapping of the B. subtilis birA gene to a position just downstream of trpC2. The
six biotin-secreting, analog-resistant mutants were examined for linkage to trpC2 by phage transduction to
determine if they are located in a birA-like repressor. Each candidate was crossed with B. subtilis 168
(trpC2), and Trp+ transductants were patched to minimal plates with or without homobiotin. Results
indicated that five of the six analog resistant mutations (HB3, HB9, HB15, α-DB16, and α-DB17) are closely
linked to the birA locus (90%-95% co-transduction of Trp+ and analog-resistance). BI421 is a homobiotin
resistant (HBr) Trp+ transductant of strain 168 containing the birA mutation of α-DB16.

The sixth analog resistant mutation, that contained in α-DB9, was not linked to trpC, and therefore is not
a single mutation at birA. By transducing from α-DB9 into a bioW::cat7 strain in a similar transduction
mapping experiment (see Figure 9), the mutation in α-DB9 was shown to be unlinked to the biotin operon.
Therefore the mutant phenotype of α-DB9 is either due to a mutation at a third locus distinct from birA and
from bioWAFDBI, or it is due to mutations at more than one locus, all of which are required to express the
analog-resistant phenotype. The α-DB9 mutation can affect a biotin permease, a biotin export pump, or an
enzyme related to biotin biosynthesis. Analog-resistant mutations that are at different loci (such as those of
HB3 and α-DB9) can be combined in a single strain by standard strain construction techniques to give a
strain with even greater capacity for biotin secretion. These analog-resistant mutants, or other mutants
isolated and screened by the above procedures, may be used as host strains for biotin overproduction.

Two additional biotin analog-resistant mutations, carried by α-DB12, isolated for resistance to
α-dehydrobiotin as described above, and HB43, a spontaneous homobiotin-resistant mutant of
PY79(pBIO397cat), were also mapped to the birA locus.

XC: DNA sequence of B. subtilis birA mutants.

Mutations resulting in amino acid changes can be found in the birA genes of homobiotin resistant
strains such as HB3 and α-dehydrobiotin resistant strains such as α-DB16. To find such mutations the DNA
sequence of a wild-type B. subtilis birA gene can be compared with the DNA sequence of B. subtilis birA
genes containing biotin analog-resistant mutations. The wild-type B. subtilis birA gene sequence can be
obtained by sequencing the cloned birA gene on pBIO113 or pBIO114. The mutant birA gene sequences
can most easily be obtained from PCR copies of the gene. Several independent PCRs can be performed
using genomic DNA from each birA mutant as template, and a pair of primers known to flank the birA
coding region. DNA fragments can then be isolated from each independent PCR and cloned in E. coli
pUC21 [Vieira and Messing, Gene 100, 189-194 (1991)]. Isolates from each of two independent PCRs can be
sequenced on both strands using a series of internal primers. Any artifactual mutations introduced by PCR
should appear in only one of the two independent PCR clones, while the "true" mutation should appear in
both independent isolates.
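
The logic for separating "true" mutations from PCR artifacts can be sketched in a few lines of Python. The sketch assumes that the three sequences have already been aligned to the same length and that only base substitutions are considered; the function name and the short example sequences are hypothetical and are not birA sequence.

```python
def classify_mutations(wild_type, clone_a, clone_b):
    """Compare two independently derived PCR clones with the wild-type sequence.

    Sequences are assumed to be pre-aligned and of equal length (substitutions only).
    Returns (true_mutations, artifacts) with 1-based positions.
    """
    if not (len(wild_type) == len(clone_a) == len(clone_b)):
        raise ValueError("sequences must be aligned to the same length")
    true_mutations, artifacts = [], []
    for pos, (wt, a, b) in enumerate(zip(wild_type, clone_a, clone_b), start=1):
        if a != wt and a == b:
            # Same change in both independent clones: candidate true mutation.
            true_mutations.append((pos, wt, a))
        elif a != wt or b != wt:
            # Change seen in only one clone (or discordant changes): likely PCR artifact.
            artifacts.append((pos, wt, a, b))
    return true_mutations, artifacts


# Hypothetical 12-base example.
wt = "ATGGCTAAAGGT"
pcr1 = "ATGGTTAAAGGT"   # C->T at position 5
pcr2 = "ATGGTTAAAGAT"   # C->T at position 5, plus G->A at position 11
print(classify_mutations(wt, pcr1, pcr2))
```

Positions at which the two clones disagree with each other are reported separately so that they can be resolved by sequencing a further independent isolate.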

By comparison to the E. coli birA protein, for which the three dimensional structure is known [Wilson et
al., Proc. Natl. Acad. Sci. USA 89, 9257-9261 (1992)], the mutations can be characterized. For example, the
mutation may be located in one of the DNA-contacting helices. This information can be used to construct
improved birA mutant strains of B. subtilis with reduced capacity to regulate expression of the bio operon.
For example, two of the sequenced mutations could be combined in one gene using well known methods of
site directed mutagenesis. Alternatively, small deletions that remove the DNA binding portion of birA, but
not the biotin ligase activity, can be constructed.

EXAMPLE XI. Second generation of engineered B. subtilis biotin production strains

While constructing and characterizing the first generation of engineered B. subtilis biotin production
strains, applicants observed several limiting steps in the biotin regulatory system, modification of which can
increase biotin production. For example, SP01 promoter-driven biotin operons were not amplified to as high
a copy number as wild-type biotin operons, as shown by Southern blot data. First, identification of the gene
product that is deleterious to the cell when overproduced, and deletion of this gene from the amplified
cassette, can circumvent this problem. Second, there are two points of partial premature termination of
mRNA synthesis in the biotin operon. The following Examples illustrate how this understanding of the biotin
wild-type regulatory scheme can be used to optimize biotin production.

XIA. Constructions to improve the copy number of SP01 promoter-driven bio cassettes.

As shown above, biotin operons driven by SP01 promoters are less amplified than a biotin operon
controlled by the wild-type promoter. Applicants reasoned that the product of one or more of the bio genes is
not tolerated at high levels. High level amplification can be achieved with a biotin operon lacking that
specific bio gene. To determine which of these genes was not tolerated at high levels, Applicants designed
the following experiment.

First, to assure some constitutive level of expression of all the bio genes, the biotin promoter was
replaced in the chromosome with an SP01-15 promoter. To do this, an upstream homologous sequence
was added to a 5' cassette containing the SP01-15 promoter (pBIO168). This construction was made by
introduction of a 1.8 kb PCR fragment, generated from the sequence just upstream from the 5' biotin
cassettes, into the SalI to NruI gap of pBIO168. The 1.8 kb PCR fragment was generated using primers
designed to introduce a NruI site at the upstream end, and a SalI site at the downstream end. The resulting plasmid,
pBIO180, has the SP01-15 promoter flanked by 1.8 kb of homologous sequence upstream of the bio
operon and 0.7 kb downstream of the promoter. This plasmid was used to transform A::CBh : (see Example
III), which has the promoter region of the biotin operon replaced by a cat cassette and is auxotrophic for
biotin. A double recombination event allowed the replacement of the cat cassette with the SP01-15
promoter, yielding the desired prototrophic strain, BI294 (Cms).

A deletion in each of the bio genes can be generated by standard techniques. Below is one example of
how a nonpolar deletion mutation was constructed in bioW.

A deletion of bioW was generated by altering the 5' cassette pBIO168. A PCR fragment was generated
which has the SP01-15 promoter and first three codons of bioW followed by a BamHI site (Fig. 14). This
PCR fragment was engineered so that after replacing the XhoI to BamHI fragment of pBIO168, the resulting
5' cassette, pBIO178, forms an in-frame bioW deletion upon ligation with the 3' bio cassettes (see Fig. 14).
Transformation of this ligation mixture into B. subtilis BI294 (see above) and selection of Cmr integrants
(BI296) allowed amplification of the biotin operon without amplifying bioW. The chromosomal copy of bioW
is still transcribed from the SP01-15 promoter. A control with a complete SP01-15 driven biotin operon
integrated into BI294 was also constructed and called BI295. Copy number of the operon was reduced in
both cases, compared to amplified BI247 (Example VIICiii). Comparison of the amplification of these two
strains suggested that bioW is not the gene whose product is deleterious when overproduced.

This procedure can be repeated with each bio gene in turn to identify the gene that is deleterious when 
overproduced. 

Analysis of biotin production by BI296, lacking the amplified bioW gene, compared to the isogenic
strain BI295, containing an amplified bioW gene, indicated that the product of bioW is not the rate-limiting
enzyme for biotin biosynthesis (Table 15A). BI296 produced about 10 times more biotin than the parent
strain BI294 without the amplified bio cassette. Furthermore, BI296 produced similar amounts of biotin as
BI295, the isogenic control strain with the complete SP01-bio operon. Repeating such experiments with
internal, nonpolar deletions in each bio gene will identify the rate-limiting gene for biotin biosynthesis in B.
subtilis.
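
The fold differences quoted above can be checked against the biotin titers reported in Table 15A. The short Python sketch below averages the duplicate cultures listed for each strain; averaging the duplicates is an assumption made here for illustration, and the function name is hypothetical.

```python
from statistics import mean

# Biotin titers (ug/l) for the duplicate cultures listed in Table 15A.
BIOTIN_UG_PER_L = {
    "BI294": [165, 125],    # single-copy SP01-bio
    "BI295": [1116, 1405],  # amplified complete SP01-bio operon
    "BI296": [1402, 1574],  # amplified SP01-bio operon with bioW deleted
}


def fold_change(strain, reference, data=BIOTIN_UG_PER_L):
    """Ratio of the mean titer of strain to the mean titer of reference."""
    return mean(data[strain]) / mean(data[reference])


print(round(fold_change("BI296", "BI294"), 1))  # about 10-fold over the parent strain
print(round(fold_change("BI296", "BI295"), 2))  # close to 1, i.e. similar to the isogenic control
```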
Table 15A

Biotin production by various B. subtilis strains

Strain name   Relevant features                          Biotin (ug/l)*   Biotin & Vitamers (ug/l)**
PY79          prototroph                                      4                10
PY79          prototroph                                      6                10
DB16          birA                                           46               140
BI294         SP01-bio/PY79                                 165               360
BI294         SP01-bio/PY79                                 125               450
BI295A        amplified SP01-bio operon/BI294              1116              3150
BI295B        amplified SP01-bio operon/BI294              1405              5000
BI296A        ΔbioW amplified SP01-bio operon/BI294        1402              4000
BI296B        ΔbioW amplified SP01-bio operon/BI294        1574              3950

* Assayed with L. plantarum str3
** Assayed with S. cerevisiae



XIB. Removal of possible transcription termination sites.

There are two internal sites of termination within the biotin operon. An mRNA fragment of about 0.8 kb
is observed which corresponds to the distance from the promoter to a region in bioW which shares
homology with the consensus sequence for a B. subtilis catabolite repression sequence (CRS). The major
biotin transcript seen in Northerns is 5.2 kb. This corresponds to the distance between the promoter and the
bioB-bioI junction where a stem-loop structure is followed by a string of T residues. Constructions were
made to eliminate the CRS and to increase the level of transcription past the bioB-bioI junction to the end
of the operon.

XIBi: Removal of the catabolite repression sequence.

In B. subtilis, sporulation and the synthesis of certain enzymes are subjected to catabolite repression
[Chambliss, G.H., in "Bacillus subtilis and Other Gram-Positive Bacteria", Sonenshein et al., eds., Amer.
Soc. Microbiology, Washington, D.C., pp. 213-219 (1993)].

Two potential catabolite repression sites (CRS) are located in or around the bio operon. One is located
within the putative 5' leader region of ORF3. The second catabolite repression-like sequence was located
within the 3' end of bioW. The location of this sequence coincides with the 3' end of a 0.8 kb bio-specific
transcript detected in Northern blots, suggesting that catabolite repression might control, in part, expression
of the bio operon. There is also a short AbrB regulatory sequence within this catabolite repression-like
sequence [Strauch, M.A., in "Bacillus subtilis and Other Gram-Positive Bacteria", Sonenshein et al., eds.,
Amer. Soc. Microbiology, Washington, D.C., pp. 757-764 (1993)].

The portion of bioW encoding the BamHI site and the CRS is illustrated in Fig. 17A. The CRS starts 11
bp downstream from the BamHI site. Four codons in the sequence comprising the CRS can be converted to
alternative codons by changes in the third position without altering the amino acid sequence. The third
position changes alter three of the four most highly conserved residues (underlined) of the CRS (Fig. 17A).
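
The principle of changing third codon positions without altering the encoded amino acids can be checked programmatically. The following Python sketch uses the standard genetic code; the example sequence is hypothetical and is not the bioW sequence shown in Fig. 17A, and the function names are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent(original, mutated):
    """True if both sequences have the same length and encode the same amino acids."""
    return len(original) == len(mutated) and translate(original) == translate(mutated)


def changed_positions(original, mutated):
    """1-based positions at which the two sequences differ."""
    pairs = zip(original.upper(), mutated.upper())
    return [i + 1 for i, (a, b) in enumerate(pairs) if a != b]


# Hypothetical four-codon stretch with every third position changed.
before = "GGTAAACTGGCA"
after = "GGCAAGCTCGCG"
print(translate(before), is_silent(before, after), changed_positions(before, after))
```

A check of this kind confirms only that the protein sequence is preserved; whether the altered bases actually weaken a regulatory site such as the CRS has to be judged against the consensus sequence.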

As shown in Fig. 17A, the CRS site in bioW also has significant homology to an AbrB consensus
binding site. A concern when altering the sequence of a CRS is that the binding site for AbrB is similar in
sequence, and care must be taken not to generate a strong AbrB binding site when destroying the CRS.
However, the alterations introduced to destroy the CRS also reduce homology to the AbrB site. To
introduce the changes indicated in Fig. 17A, a PCR primer was designed to include the BamHI site, the
CRS region with the desired mutations, and twenty residues for priming. An appropriate downstream primer
allowed generation of a 660 bp fragment which could be digested with BamHI and Bst1107I. Both BamHI
and Bst1107I restriction enzymes have unique sites in the plasmid pBIO289. The BamHI and Bst1107I cut
PCR product was cloned into BamHI and Bst1107I cut pBIO289 to yield plasmid pBIO179. The BamHI to
EcoRI fragment from pBIO179 was then cloned into pBIO146A to generate a new 3' cassette plasmid,
pBIO183. To change the sequence of the chromosomal copy of bioW, a two step protocol was utilized. First,
a cat gene was introduced at the BamHI site in bioW (see Example III) of BI294 (Example XIA). This
generated an auxotroph, BI294::cat7. Transformation of the auxotroph with linearized pBIO179 and selection
for Bio+ yielded strain BI297, which has a single chromosomal copy of the biotin operon driven by the
SP01-15 promoter with the sequence of bioW altered to destroy the catabolite repression sequence. The use of
the new 3' cassette, pBIO183, with a 5' cassette, i.e., pBIO168, to integrate and amplify in BI297 will assure
amplification of a modified bioW and may relieve premature termination due to catabolite repression. This
strain is BI306. With this procedure, second generation production strains are generated which might be
less sensitive to catabolite repression.

XIBii. Removal or bypass of the termination site after bioB.

Two strategies were adopted to increase expression of biotin genes that lie downstream from internal
sites of transcript termination. The first strategy involves deletion of the terminator. The second strategy is
to insert an SP01-15 promoter in front of bioI, in order to provide strong transcription of bioI and ORF2.

To delete the terminator (Fig. 17B), which is in an intercistronic region, appropriate PCR primers were
designed. One primer hybridized to bioB, upstream from a unique BspEI site. The second primer
complemented the PmlI site, the ribosome binding site of bioI, skipped 51 bp and then complemented the
stop codon and 23 bp at the 3' end of bioB (Fig. 17B). Digestion with BspEI and PmlI generated a 209 bp
fragment which was used to replace the BspEI to PmlI fragment of pBIO289 to give pBIO181. This plasmid
was used to generate a new 3' cassette by cloning the BamHI to EcoRI fragment into pBIO146A to yield
pBIO185. Alteration of the chromosomal biotin operon in BI294 to delete the terminator between bioB and
bioI was accomplished in two steps. First, using a strategy similar to those described in Example III, a cat
gene was introduced into the end of bioB. This yielded the Bio-, Cmr strain BI300. When BI300 was
transformed by linearized plasmid pBIO181, Bio+ isolates contained the desired terminator deletion and are
represented by BI303. Integrated and amplified biotin operons containing the terminator deletion were
constructed by transforming BI303 with ligated BamHI to NotI fragments from pBIO168 and pBIO185 and
are represented by BI307.

To assure maximum expression of bioI and ORF2, an SP01-15 promoter was introduced in front of bioI. 
The SP01-15 promoter from pBIO168 was amplified by PCR, introducing the ribosome binding site and 
start codon/PmlI site of bioI on the downstream side, and a StuI site on the upstream side. Primers used 
were: 1) 5'-GGC CAT TCT ACA CGT GAT TTT CTC CTT TCT GTC TAG AAC AGG CGG GGT TGC; and 
2) 5'-GGC CAG GCC TGG CTA TTG ACG ACA GCT ATG GTT. Since digestion of DNA by StuI and PmlI 
creates blunt ends, digestion of pBIO289 with PmlI allowed introduction of the StuI/PmlI digested PCR 
fragment. In one orientation the PmlI site is regenerated at the bioI start codon and the SP01-15 promoter 
directs transcription of bioI and ORF2. The plasmid with this orientation was called pBIO182. A new 3' 
cassette (pBIO184) was constructed by cloning the BamHI to EcoRI fragment of pBIO182 into pBIO146A. 
This construction is expected to generate even more transcription of bioI and ORF2 than would be 
generated by elimination of termination between bioB and bioI. 
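
As an illustrative check of the primer design just described (not part of the original procedure), the following minimal Python sketch locates the PmlI and StuI recognition sequences in the two primers as printed above. The recognition sequences CACGTG (PmlI) and AGGCCT (StuI) are the standard ones for these blunt-cutting enzymes; the function and variable names are hypothetical.

```python
# Recognition sequences of the two blunt-cutting enzymes named in the text.
PMLI_SITE = "CACGTG"  # PmlI cuts CAC^GTG
STUI_SITE = "AGGCCT"  # StuI cuts AGG^CCT

# Primers exactly as printed in the specification (5' to 3', triplet-spaced).
PRIMER_1 = "GGC CAT TCT ACA CGT GAT TTT CTC CTT TCT GTC TAG AAC AGG CGG GGT TGC"
PRIMER_2 = "GGC CAG GCC TGG CTA TTG ACG ACA GCT ATG GTT"


def find_site(primer: str, site: str) -> int:
    """Return the 0-based index of `site` in `primer`, or -1 if it is absent."""
    flat = primer.replace(" ", "").upper()
    return flat.find(site)


if __name__ == "__main__":
    print("PmlI site in primer 1 at index", find_site(PRIMER_1, PMLI_SITE))
    print("StuI site in primer 2 at index", find_site(PRIMER_2, STUI_SITE))
```

Each primer carries exactly one of the two sites near its 5' end, which is consistent with the StuI/PmlI digestion of the PCR fragment described above.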

The SP01-15 driven bioI construction was introduced into the chromosomal copy of BI294 by 
transduction of BI300 (see above) with linearized pBIO182 and selection for Bio+, yielding BI304. Integration 
and amplification gave BI308. 

XIBiii. Biotin production by single copy terminator modified strains. 

Biotin and vitamers were assayed from test tube cultures as described in Example VINA utilizing 
Lactobacillus and Saccharomyces. BI294 was used as a control for BI303, in which the terminator 
between bioB and bioI was deleted, and for BI304, in which an SP01-15 promoter was introduced before bioI. 
As demonstrated in Table 15B, deletion of the terminator or introduction of the SP01-15 promoter before bioI 
has little effect on biotin titers but dramatically increases the production of biotin vitamers. 

Table 15B 

Biotin and Vitamer Assays of Terminator Modified Strains. 

Strain    Biotin locus     Biotin (µg/l)   Vitamer (µg/l) 
BI294C    SP01-15bio       126             481 
BI294D                     315             528 
BI303A    SP01-15bioΔT     313             838 
BI303B                     200             790 
BI304A    SP01 -15&/0      182             2800 
BI304B    SP01-15b/O/      179             3138 

EXAMPLE XIC: Altering bio gene ribosome binding sites. 

Translation of genes in the bio operon can be improved by altering the ribosome-binding sites to 
conform more closely to a canonical B. subtilis ribosome binding site with the sequence 5'-AGAAAGGAGGTGA-3'. 
Such changes can be introduced by synthesis of a DNA primer encoding an appropriate restriction 
site, the modified ribosome-binding site, and sufficient downstream DNA to ensure priming of a PCR 
reaction. By selection of an appropriate second primer, one skilled in the art can synthesize a PCR product 
containing the modified ribosome-binding site. This PCR fragment containing the altered ribosome-binding 
site can then be introduced into an engineered bio operon by the same methodology used to introduce the 
modified CRS sequence described in Example XIBi. 
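
To make "conform more closely" concrete, the sketch below counts position-by-position matches between a candidate site and the canonical sequence given above. It is only an illustration under stated assumptions: the 13-base candidate is taken from the reverse complement of primer 1 of Example XIBii (the stretch immediately upstream of the PmlI site), and aligning it end to end with the canonical site is an assumption made here, not a statement from the specification. The function names are hypothetical.

```python
CANONICAL_RBS = "AGAAAGGAGGTGA"  # canonical B. subtilis site quoted in the text


def matching_positions(candidate: str, canonical: str = CANONICAL_RBS) -> int:
    """Count positions at which an equal-length candidate matches the canonical site."""
    candidate = candidate.upper()
    if len(candidate) != len(canonical):
        raise ValueError("candidate must have the same length as the canonical site")
    return sum(a == b for a, b in zip(candidate, canonical))


def mismatch_indices(candidate: str, canonical: str = CANONICAL_RBS) -> list:
    """Return the 0-based positions that a modified primer would need to change."""
    candidate = candidate.upper()
    if len(candidate) != len(canonical):
        raise ValueError("candidate must have the same length as the canonical site")
    return [i for i, (a, b) in enumerate(zip(candidate, canonical)) if a != b]


if __name__ == "__main__":
    # Assumed alignment: 13 bases upstream of the PmlI site preceding bioI.
    candidate = "AGAAAGGAGAAAA"
    print(matching_positions(candidate), "of", len(CANONICAL_RBS), "positions match")
    print("positions to change:", mismatch_indices(candidate))
```

Under that assumed alignment, a primer carrying the canonical site would differ from the candidate at only three positions.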

EXAMPLE XII: Azelaic acid-resistant (Azl^r) mutants of B. subtilis. 

Azelaic acid, a straight chain C9 dicarboxylic acid, is a homolog of pimelic acid and is thought to be an 
intermediate in the conversion of oleic acid to pimelic acid [see Ohsugi and Inoue, Agric. Biol. Chem. 45, 
2355-2356 (1981)]. Pimelic acid at 1 g/l can stimulate biotin vitamer production in B. subtilis, and pimelic 
acid at 30 mg/l can restore wild-type growth to a PY79 bioI::cat bradytroph (see Example III). Azelaic acid 
at 30 mg/l does not substitute for pimelic acid in supporting growth of PY79 bioI::cat. In fact, azelaic acid 
at 30 mg/l severely inhibited the growth of PY79 bioI::cat. Azelaic acid at higher concentrations inhibits the 
growth of wild-type B. subtilis, this inhibition being reversed by addition of 1 µg/l biotin. From these results, 
Applicants reasoned that azelaic acid is a specific inhibitor of biotin biosynthesis in B. subtilis. 

A wild-type E. coli strain, MM294, is relatively resistant to azelaic acid. The E. coli strain RY604 
(ΔbioH), containing pBIO403 which includes only the bioW gene from B. subtilis, is auxotrophic for biotin, 
although 30 mg/l pimelic acid can satisfy the biotin requirement. However, RY604/pBIO403 grown in the 
presence of excess pimelic acid (80 mg/l) is sensitive to inhibition by azelaic acid. Therefore, Applicants 
concluded that azelaic acid acts at the level of pimelyl-CoA synthetase (bioW), either as a competitive 
inhibitor of pimelic acid, or by incorporation into a biotin homolog or other toxic intermediate. 

XIIA: Isolation of azelaic acid resistant mutants of B. subtilis. 

On minimal agar, azelaic acid at 2 g/l (about 10^-2 M) severely inhibited growth of PY79, although it did 
not kill or completely prevent growth. Seven spontaneous mutants that outgrew the background of PY79 on 
2 g/l azelaic acid were isolated. The resistance to azelaic acid appeared to be a stable trait in all but one 
case (see below). The seven mutants were provisionally named PA1 - PA7 (PY79 Azelaic acid resistant). 
PA1 through PA7 were grown in test tubes in VY medium, and biotin production was assayed using E. coli 
indicator strains (Table 17). The mutants fell into two classes, those that yielded more biotin than PY79 
(PA5, PA6) and those that were similar to PY79 (PA1, 2, 3, and 7). PA4 appeared to be either unstable or 
not a true mutant and was dropped from further study. Applicants also noticed that the mutants fell into two 
classes with respect to colony size on minimal agar with 2 g/l azelaic acid, and that these two classes 
corresponded to the two classes of biotin producers (Table 17). The mutants with the most biotin in the 
supernatant (PA5 and PA6) gave small colonies, while the biotin non-secreters, PA1, 2, 3, and 7, gave large 
colonies. PA1 and PA3 secreted a compound that cross-fed PY79 (i.e., reversed the azelaic acid inhibition 
of PY79) on minimal plates containing 2 g/l azelaic acid. 

Representatives of the two classes of azelaic acid resistant mutants, PA3 and PA6, were chosen for 
further characterization. In liquid minimal cultures containing serially diluted azelaic acid, PA3 and PA6 
showed clear resistance to azelaic acid compared to the parent, PY79 (Fig. 18). However, the dose 
response curves of PA3 and PA6 were distinct from each other. PA3 showed greater resistance than PA6, 
and at lower concentrations of azelaic acid PA6 did not grow to the same cell density as PA3 or PY79. 

XIIB: Mapping of azelaic acid resistant mutants. 

To map the azelaic acid resistant mutations in PA3 and PA6, Applicants determined whether either 
mutation maps at birA or at the biotin operon. In the first case, PBS1 transducing lysates from PA3 and 
PA6 were applied to strain RL1 (trpC2), and Trp+ transductants were selected. The trpC2 and birA loci are 
about 90% linked by transduction. The Trp+ transductants were then screened for azelaic acid resistance by 
patching to minimal agar containing 2 g/l azelaic acid. No Trp+ transductants were azelaic acid resistant, 
demonstrating that neither mutation is linked to trpC2, and therefore neither is a birA mutation. The PA3 
mutation showed strong linkage to bio in two transductions (one into PY79 P 0ja .:caM7 and one into PY79 
bioW::cat1), while the PA6 mutation did not. 

XIIC: Effect of azelaic acid resistance mutations on biotin production. 

Another approach to altering biotin regulation is to combine either of the azelaic acid resistant mutations 
with a birA mutation. 

The birA mutation from either HB3 or α-DB16 was introduced into either PA3 or PA6 by a two-step 
transduction process. First, PA3 and PA6 were made Trp- by transduction with trpE::Tn917lac [Perkins 
and Youngman, Proc. Natl. Acad. Sci. USA 83, 140-144 (1986)], selecting for erythromycin resistance. The 
birA mutations were then transduced from HB3 or α-DB16, selecting for Trp+. Parents and Trp+ transductants 
were screened for homobiotin resistance and azelaic acid resistance. PA6 was homobiotin sensitive, so 
double mutants could be identified among Trp+ transductants by screening for homobiotin resistant 
colonies. Only about 60% of Trp+ transductants were also homobiotin resistant (this may be due to Tn917 
distortion of the map distance between birA and trpE, normally 90% linkage by transduction). PA3 was 
resistant to homobiotin, so direct identification of double mutants was not possible. However, 60% of the 
transductants were double mutants as judged by increased biotin secretion (see below). 

The parent strains, putative PA3 birA double mutants, and actual PA6 birA mutants were tested for 
biotin and vitamer production in VY test tube cultures. As shown in Table 18, the double mutants derived 
from PA3 produced about four to six fold more vitamer and twice as much biotin as the birA parent. Double 
mutants derived from PA6 produced similar or only slightly more biotin and vitamer than the birA parent. 
Clearly, the PA3 mutation aids biotin production in a deregulated strain. 
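
The fold increases quoted above can be reproduced from the Table 18 values. The following is a minimal sketch, not part of the original analysis: the numbers are transcribed from Table 18 (µg/l), the donor label "alpha-DB16" stands for α-DB16, and the function and variable names are hypothetical.

```python
# (biotin, vitamers) in µg/l for the birA parent strains, from Table 18.
BIRA_PARENTS = {"HB3": (44, 110), "alpha-DB16": (46, 140)}

# (birA donor, isolate number, biotin, vitamers) for the PA3-derived double mutants.
PA3_DOUBLES = [
    ("HB3", 1, 120, 690),
    ("HB3", 2, 100, 440),
    ("alpha-DB16", 1, 100, 690),
]


def fold_changes(doubles, parents):
    """Return (donor, isolate, biotin fold, vitamer fold) relative to the birA parent."""
    rows = []
    for donor, isolate, biotin, vitamers in doubles:
        parent_biotin, parent_vitamers = parents[donor]
        rows.append((donor, isolate,
                     round(biotin / parent_biotin, 1),
                     round(vitamers / parent_vitamers, 1)))
    return rows


if __name__ == "__main__":
    for donor, isolate, biotin_fold, vitamer_fold in fold_changes(PA3_DOUBLES, BIRA_PARENTS):
        print(f"PA3 x {donor} isolate {isolate}: "
              f"biotin x{biotin_fold}, vitamers x{vitamer_fold}")
```

The ratios fall between roughly 4 and 6 for vitamers and slightly above 2 for biotin, in line with the statement in the text.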

XIID: Additional azelaic acid resistant mutants. 

In addition to the azelaic acid resistant mutants of the types represented by PA3 and PA6, several other 
azelaic acid resistant mutants that represent at least two new classes have been isolated as spontaneous 
mutants from the PY79 strain background. 

Eleven additional mutants were partially mapped by transduction as described above to determine if the 
azelaic acid resistant mutation was linked to birA or to the bio operon. None were linked to trpC2, so by 
inference none were at the birA locus. On the other hand, eight of the eleven tested were linked to the bio 
operon. Of those eight, two were tightly linked to the bio promoter, as was PA3, while six were substantially 
less than 100% linked to the bio promoter, suggesting that the mutations were in the bio operon well 
downstream from the promoter. Thus this latter group of six mutants represents another class of azelaic 
acid resistant mutants distinct from PA3 and PA6. This group includes BI514, BI521, BI532, BI535, BI537 
and BI545. This group of mutants is likely to include mutants that have increased capacity to produce or 
utilize pimelic acid, for example bioI or bioW mutants. 

Three of the new mutants did not map at birA nor at the bio operon. This group includes BI523, BI544, 
and BI549, and is likely to contain mutants that produce increased levels of pimelic acid precursors or that 
are more efficient at converting various biotin precursors into biotin. None of this group were equivalent to 
PA6, since unlike PA6, they all grew to the same density as PY79 (wild type) in a minimal medium lacking 
azelaic acid. 

The new mutants are summarized in Table 19. Although none lead to significantly increased biotin 
production by themselves, these mutations are likely to increase biotin production when combined with 
other biotin deregulating mutations, as was the case for PA3. 

Applicants have deposited on May 4, 1994 under the terms and conditions of the Budapest Treaty 
strains PA3, HB43, HB3, BI544, BI535, BI421, BI304, BI282 and BI274 with the American Type Culture 
Collection (ATCC) in Rockville, Maryland, USA and they have received accession numbers ATCC 55567, 
55568, 55569, 55570, 55571, 55572, 55573, 55574 and 55575 respectively. 

Table 17 

Biotin and vitamer production by azelaic acid resistant mutants in test tube cultures. 

Strain                 Colony size on 2 g/l azelaic acid   Biotin (µg/l)   Vitamers (µg/l) 
PY79                   tiny                                10              11 
PA1                    large                               10              13 
PA2                    large                               10              12 
PA3                    large                               10              12 
PA4                    tiny                                10              11 
PA7                    large                               10              12 
PA5                    small                               30              60 
PA6                    small                               30              60 
VY Medium (no cells)                                       30              100 
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Table 18 



Biotin and vitamer production by birA azl double mutants. 

Parent strain   Donor for birA gene   Isolate number   Homobiotin resistant/sensitive   Biotin (µg/l) a   Vitamers (µg/l) b 
VY medium                                                                               22                25 
PY79                                                   S                                6                 10 
PA3                                                    R                                5                 9 
PA3             HB3                   1                R                                120               690 
PA3             HB3                   2                R                                100               440 
PA3             α-DB16                1                R                                100               690 
PA6                                                    S                                11                16 
PA6             HB3                   2                R                                46                200 
PA6             HB3                   6                R                                50                120 
PA6             α-DB16                8                R                                53                220 
PA6             α-DB16                14               R                                56                200 
HB3                                                    R                                44                110 
α-DB16                                                 R                                46                140 

a. Assayed using L. plantarum strS 
b. Assayed using S. cerevisiae 
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Table 19 



Additional Azelaic Acid Resistant Mutants 

Strain Name        Linkage to bio (%) a 
DiDoU              100 
blbJd              100 
BI514              70 
BI521              90 
BI532              80 
BI535 
BI537              40 
BI545              90 
BI523              0 
BI544              0 
BI549              0 
PA3                100 
PA6                0 
PY79 (wild type) 

a. Approximate linkage in percent Bio+, azelaic acid resistant upon PBS1 
transduction into PY79 9^:CatM 
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35 



SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: 
(A) NAME: F. HOFFMANN-LA ROCHE AG 
(B) STREET: Grenzacherstrasse 124 
(C) CITY: Basle 
(D) STATE: BS 
(E) COUNTRY: Switzerland 
(F) POSTAL CODE (ZIP): CH-4002 
(G) TELEPHONE: 061 - 688 25 05 
(H) TELEFAX: 061 - 688 13 95 
(I) TELEX: 9622 92/965542 hlr ch 

(ii) TITLE OF INVENTION: BIOTIN BIOSYNTHESIS IN BACILLUS SUBTILIS 
(iii) NUMBER OF SEQUENCES: 17 

(iv) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Floppy disk 
(B) COMPUTER: IBM PC compatible 
(C) OPERATING SYSTEM: System 7.1 (Mac) 
(D) SOFTWARE: Microsoft Word 5.0 

(vi) PRIOR APPLICATION DATA: 
(A) APPLICATION NUMBER: US 08/084,709 
(B) FILING DATE: 25-JUNE-1993 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8478 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

CGCATCGGAG ATCCAAAGCC TGATCGCGCC GCGCCCGCAC CTTAGTCTTG TTGGTGTACA 60 
CGATCGGTTA ACGCCGGCTG AGGGCGTGGA CAAAATCGAA AAAGAATTGA CAGCTGTCTA 120 
TGCTGGACAG GGAGCTGCTG ATTGCTACCG AGTGGTCCGT TCTGCTTCGG GACATTTCGA 180 
AACAGCAGTT ATAAGGCATG AAGCTGTCCG GTTTTTGCAA AAGTCGCTGT GACTGTAAAA 240 
AGAAATCGAA AAAGACCGTT TTGTGTGAAA ACGGTCTTTT TGTTTCCTTT TAACCAACTG 300 
CCATAAATCG ATCCTTTCTT CTATTGACAG AAACAGGAGA GAATAATATA TTCTAATTGT 360 
TAACCTTTGA ATATAATTGG TTAACAATTT AGGTGAGAAG CGCTACACGT TCTTCAGTTA 420 
TCAGTGAAAG GCCGAGAAAT GATGCAAGAA GAAACTTTTT ATAGTGTCAG AATGAGGGCT 480 
TCAATGAATG GATCTCATGA AGACGGCGGA AAGCATATAT CCGGCGGAGA ACGGCTTATT 540 
CCTTTCCATG AGATGAAGCA TACAGTCAAT GCTTTATTAG AAAAAGGGTT ATCCCATTCA 600 
AGAGGAAAAC CTGATTTTAT GCAAATTCAA TTTCAAGAGG TACATGAATC GATAAAAACC 660 
ATTCAGCCAT TGCCTGTGCA TACGAATGAA GTGAGCTGCC CGGAAGAAGG ACAAAAGCTT 720 
GCCCGATTGT TATTGGAAAA AGAAGGCGTT TCACGA3ACG TGATTGAAAA AGCATATGAA 780 
CAAATCCCTG AATGGTCAGA TGTCAGGGG7 GCGGTGTTGT TTGATATT3A TACA3GCAAG 840 
CGAAT'J . ATC AAACAAAAGA AAAAGGGGTG CCGGTCTCCA GAATGGATTG GCCG3ACGCT 900 
AATTTTGAAA AATGGGCGCT TCACAGTCAC GTGCCAGCTC ATTCAAGAAT AAAAGAGGCC 960 
CTTGC3CTCG CTTCAAAGGT AAGCCGGCAC CCGGCAGTCG TTGCAGAATT ATG3TGGTC3 1020 
GACGATCCGG ATTACATAAC AGGCTATGT7 GCGGGTAAGA AAATGGGCTA TCAGGGTATT 1080 
ACAGCAATGA AAGAATACGG GACTGAAGAG GGCTGCCGAG 7CTTTTTTAT TGATGGATCC 1140 
AATGATGTAA ACACCTACAT ACATGACCTG GAGAAGCAGC CTATTTTAAT AGAGTGGGAG 1200 
GAA3ATCATG ACTCATGATT TGATAGAAAA AAGTAAAAAG CACCTCTGGC TGCCATTTAC 1260 
CCAAATGAAA GATTATGATG AAAACCCCTT AATCATCGAA AGCGGGACTG GAATCAAAGT 1320 
CAAAGACATA AACGGCAAGG AATACTATGA CGGTTTTTCA TCGGTTTGGC TTAATGTCCA 1380 
CGGACACCGC AAAAAAGAAC TAGATGACGC CATAAAAAAA CAGCTCGGAA AAATTGCGCA 1440 
CTCCACGTTA TTGGGCATGA CCAATGTTCC AGCAACCCAG CTTGCCGAAA CATTAATC3A 1500 
CATCAGCCCA AAAAAGCTCA CGCGGGTCTT TTATTCAGAC AGCGGCGCAG AGGCGATGGA 1560 
AATAGCCCTA AAAATGGCGT TTCAGTATTG GAAGAACATC GGGAAGCCCG AGAAACAAAA 1620 
ATTCATCGCA ATGAAAAACG GGTATCACGG TGATACGATT GGCGCCGTCA GTGTCGGTTC 1680 
AATTGAGCTT TTTCACCACG TATACGGCCC GTTGATGTTC GAGAGTTACA AGGCCCCGAT 1740 
TCCTTATGTG TATCGTTCTG AAAGCGGTGA TCCTGATGAG TGCCGTGATC AGTGCCTGCG 1800 
AGAGCTTGCA CAGCTGCTTG AGGAACATCA TGAGGAAATT GCCGCGCTTT CCATTGAATC 1860 
AATGGTACAA GGCGCGTCCG GTATGATCGT GATGCCGGAA GGATATTTGG CAGGCGTGCG 1920 
CGAGCTATGT ACAACA7ACG ATGTCTTAAT GATCGTTGAT GAAGTCGCTA CAGGCTTTGG 1980 
CCGTACAGGA AAAATGTTTG CGTGCGAGCA CGAGAATGTC CAGCCTGATC TGATGGCTGC 2040 
CGGTAAAGGC ATTACAGGAG GCTATTTGCC AATTGCCGTT ACGTTTGCCA CTGAAGACAT 2100 
v.4Al i\J\Kj o ^ n CCtaaaaaCC TTTTTCCATG GCCATTCCTA 2160 
TACAGGCAAT CAGCTTGGCT GTGCGGTTGC GCTTGAAAAT CTGGCATTAT TTGAATCTGA 2220 
AAACATTGTG GAACAAGTAG CGGAAAAAAG TAAAAAGCTC CATTTTCTTC TTCAAGATCT 2280 
GCACGCTCTT CCTCATGTTG GGGATATTCG GCAGCTTGGC TTTATGTGCG GTGCAGAGCT 2340 
TGTACGATCA AAGGAAACTA AAGAACCTTA CCCGGCTGAT CGGCGGATTG GATACAAAGT 2400 
TTCCTTAAAA ATGAGAGAGT TAGGAATGCT GACAAGACCG CTTGGGGACG TGATTGCATT 2460 
TCTTCCTCCT CTTGCCAGCA CAGCTGAAGA GCTCTCGGAA ATGGTTGCCA TTATGAAACA 2520 
AGCGATCCAC GAGGTTACGA GCCTTGAAGA TTGATTCCTG GTTAAACGAG CGGTTAGACA 2580 
GAATGAAAGA AGCCGGCGTA CATCGTAACC TGCGGTCAAT GGATGGAGCG CCGGTTCCAG 2640 
AGAGGAATAT TGATGGCGAA AATCAAACGG TCTGGTCCTC AAACAATTAT TTAGGGCTCG 2700 
CAAGC3ATAG ACGTTTGATC GATGCAGCCC AAACAGCATT GCAGCAATTT GGGACAGGAA 2760 
GCAGCGGTTC ACGTTTAACG ACAGGCAATT CCGTCTGGCA TGAAAAGCTA GAAAAGAAGA 2820 
TTGCCAGCTT TAAACTGACA GAAGCGGCCC TGCTGTTTTC GAGCGGTTAC TTGGCCAATG 2880 
TCGGTGTCCT TTCATCCTTG CCAGAAAAGG AAGATGTCAT TTTAAGTGAC CAGCTCAATC 2940 
ATGCAAGTAT GATCGACGGC TGCCGACTTT CTAAGGCTGA TACAGTTGTT TATCGGCATA 3000 
TTGATATGAA TGATCTTGAA AACAAGCTGA ATGAAACACA GCGTTATCAG CGCCGTTTTA 3060 
TCGTAACAGA CGGAGTATTC AGCATGGATG GCACAATCGC CCCTCTTGAT CAGATCATCT 3120 
CACTTGCGAA ACGCTATCAT GCCTTCGTGG TCGTTGATGA TGCCCACGCA ACAGGAGTTT 3180 
TGGGCGATTC GGGACAAGGA ACGAGTGAAT ACTTTGGTGT TTGTCCCGAC ATTGTTATCG 3240 
GCACCTTAAG CAAAGCTGTT GGCGCGGAAG GAGGTTTTGC GGCAGGATCA GCGGTCTTCA 3300 
TCGACTTTTT GCTGAACCAT GCCAGAACAT TTATCTTTCA AACCGCTATT CCGCCAGCCA 3360 
GCTGTGCGGC TGCTCACGAG GCTTT'JAACA TCATTGAAGC CAGCAGGGAA AAACGACAGC 3420 
TTTTATTTTC TTATATCAGC ATGATCAGAA CCAGTCTGAA GAATATGGGT TATGTGGTGA 3480 
AAGGAGATCA CACACCGATT ATTCCTGTAG TCATTGGCGA TGCCCATAAA ACGGTCCTAT 3540 
TTGCTGAAAA ACTGCAGGGC AAGGGAATTT ATGCTCCTGC CATTCGGCCG CCAACCGTTG 3600 
CGCCGGGTGA AAGCCGGATT CGAATTACAA TCACGTCTGA CCACAGTATG GGTGATATTG 3660 
ATCATTTGCT GCAAACATTT CATTCAATCG GAAAGGAGCT GCACATCATT TGAGGGGTTT 3720 
TTTTGTGACG GGAACTGATA CAGAAGTAGG GAAAACGGTT ATATCCAGCG GTCTTGCTGC 3780 
CTTATTGAAA GACAATAATA GACATGTCGG GGTGTATAAA CCATTTTTAA GCGGGATATC 3840 
GCGCCATCAT CCAGATAGTG ATACAAGTTT GCTGAAAGAT ATGTCGCAGA CCAGTCTTTC 3900 
TCATGAAGAC ATTACGCCTT TTGCCTTCAA GGCGCCGCTT GCACCATACG TTGCAGGGAA 3960 
ACTTGAGGGA AAGACTGTCA CCATGGAAGA GGTTTTAAGC CATTGGGGGC GGATTAGAGA 4020 
AAAACATGAA TGCTTCATCG TAGAAGGTGC AGGCGGTATT TCTGTGCCAT TGGGAGAGGA 4080 
CTATTTGGTC AGTCATGTCA TAAAAGCGTT GCAGCTTCCC ATGATTATTG TGGCGCGTCC 4140 
TCGCCTTGGA ACCATTAATC ATACCTTTTT AACTGTCAAA TATGCAGAAA GCATGGGGCT 4200 
TCCAATCGCC GGAATTATCA TCAATGGAAT CAGTGACTCT CCTGATGAAG ATGAAAAAAC 4260 
CAATCCTGAG ATGATTGAGC CCTTATCCCG TGTGCCGATT TTAGGGGTTA CGCCAAAGCT 4320 
TGCCAACGTG ACGAAAGAAA CCGTTCTACA TATGCTAAAA GACCATATCA ATCTATCATT 4380 
ACTGATGAAT CAAGTGGGGG TATGA3AATG AATCAATGGA TGGAACTCGC AGACCGGGTG 4440 
CTGGCTGGAG CAGAAGTGAC TGAC3AAGAG GCGCTTTCAA TATTACATTG TCCTGATGAA 4500 
GATATTTTGC TATTAATGCA CGGGGCTTTT CACATCAGAA AACACTTTTA CGGAAAAAAA 4560 
GTAAAGCTCA ATATGATTAT GAATGCGAAA TCCGGGCTCT GCCCGGAAAA CTGCGGCTAT 4620 
TGTTCACAGT CTGCGATTTC GAAAGCGCCG ATTGAGTCTT ACCGGATGGT GAATAAGGAA 4680 
ACGCTGCTTG AAGGCGCGAA GCGGGCGCAC GATCTGAATA TCGGCACATA TTGTATCGTG 4740 
GCAAGCGGCA GAGGTCCGTC TAACA3AGAA GTGGATCAGG TCGTAGATGC GGTTCAGGAA 4800 
ATTAAAGAGA CGTATGGACT GAAGATTTGT GCATGTCTTG GACTGTTGAA GCCAGAGCAG 4860 
GCGAAGCGGC TCAAAGATGC AGGAGTAGAC CGCTATAATC ATAATTTGAA TACGTCACAG 4920 
AGAAACCATT CAAACATCAC AACCTCACAT ACATACGATG ACAGAGTCAA TACGGTTGAA 4980 
ATCGCAAAAG AATCGGGGCT GTCTCCGTGT TCAGGCGCGA TTATCGGGAT GAAGGAGACG 5040 
AAACAGGATG TCATTGACAT CGCCAAAAGC TTGAAGGCTC TTGACGCGGA TTCCATTCCT 5100 
GTGAATTTTT TGCATGCAAT TGATGGCACG CCGTTAGAAG GCGTCAACGA ATTAAACCCG 5160 
CTGTATTGTT TAAAAGTGCT GGCGCTGTTC CGTTTTATCA ATCCATCAAA AGAAATTCGC 5220 
ATTTCCGGAG GAAGAGAGGT CAATCTCCGC ACATTGCAGC CATTAGGGCT TTACGCCGCA 5280 
AACTCCATTT TTGTCGGAGA CTACTTAACA ACTGCCGGGC AAGAGGAGAC GGAGGATCAT 5340 
AAAATGCTGA GTGATTTAGG CTTTGAAGTT GAATCAGTCG AAGAAATGAA GGCTAGTTTA 5400 
AGTGCGAAAA GCTGAAAGAA TCAATAAAAG CAATCGGTAT GATGTCGATT GTTTTTATTT 5460 
TTGAACAGAA AGGAGAAAAT CACGTGACAA TTGCATCGTC AACTGCATCT TCTGAGTTTT 5520 
TGAAAAACCC ATATTCTTTT TACGACACAT TGCGAGCTGT TCATCCTATC TATAAAGGGA 5580 
GTTTCTTAAA A7ACCCGGGC TGGTATGTCA CAGGATATGA AGAAACGGCT GCTATTTTGA 5640 
AAGATGCGAG ATTCAAAGTC CGCACCCCGC TGCCTGAGAG CTCAACCAAA TATCAGGACC 5700 
TTTCACATGT GCAAAATCAA ATGATGGTGT TTCAGAACCA GCCTGATCAT AGACGATTGC 5760 
GGACGCTTGC CAGCGGAGCG TTTACGCCGA GAACGACAGA GAGTTATCAG CCGTATATCA 5820 
TTGAAACTGT CCATCATTTG CTTGATCAAG TGCAAGGTAA AAAAAAGATG GAGGTCATTT 5880 
CGGACTTTGC TTTTCCTTTA GCAAGTTTTG TCATAGCTAA CATTATAGGT GTACCGGAGG 5940 
AAGATAGGGA GCAATTAAAG GAGTGGGCTG CGAGTCTCAT TCAAACGATT GATTTTACCC 6000 
GCTCAAGAAA GGCATTAACA GAGGGCAATA TTATGGCTGT GCAGGCTATG GCATATTTCA 6060 
AAGAGCTGAT TCAAAAGAGA AAACGCCACC CTCAACAGGA TATGATCAGC ATGCTCTTGA 6120 
AGGGGAGAGA AAAGGATAAG CTGACGGAAG AGGAGGCGGC ATCTACGTGC ATATTGCTG3 6180 
CGATCGCCGG ACATGAGACA ACGGTCAATC TCATCAGCAA TTCAGTCCTT TGTCTGCTGC 6240 
AGCATCCAGA ACAGCTTTTG AAACTGAGAG AAAATCCAGA TCTTATTGGT ACCGCAGTCG 6300 
AGGAATGTTT ACGCTATGAA AGCCCCACGC AAATGACAGC CAGAGTTGCG TCAGAGGATA 6360 
TTGACATCTG CGGGGTGACG ATCCGTCAAG GAGAACAAGT CTATCTTTTG TTAGGAGCGG 6420 
GTAATCGAGA CCCTAGCATA TTCACGAACC CCGAT3TCTT CGATATTACG AGAAGTCCTA 6480 
ATCCGCATCT TTCATTCGGG CATG3CCATC ATGTTTGCTT AGGGTCCTCG CTG3CACGAT 6540 
TAGAAGCGCA AATTGCGATT AACACTCTTC TGCAGCGAAT GCCCAGCCTT AATCTTGCGG 6600 
ATTTTGAATG GCGGTATCGG CCGCTTTTTG GATTTCGGGC GCTTGAGGAG CTGCCGGTGA 6660 
CTTTTGAATA AGCCTAAGAA TGTGAGTGCC AAAAAAGTGT CAGCCCCGCC GAAAATGGGC 6720 
AATCTATAAA AAAGGGGAGT GAACATCGTG AAAAAAGTGC TGATCGCCGG CGGAAATGGT 6780 
GTGATTGGGA GACTGCTTGC TGAAGGGCTT ATTTCAGACT ATGAAGTGAC TGTGCTTGAT 6840 
AAAGATCATT TCGATGGCAA AGCCTCTTCC ATTCAGGCTG ACGCGGCAAA TTATGAGGAG 6900 
CTGTTGAAGA AGATTCCAAA AGATACCGAT GCCATCTTGA ATTTACTCGC TGTGAAAATC 6960 
AAATACGATA TTATGGACAT CGCTGAGTTT GAAAAAATGA CGGATGTTTT CTATAGGGCA 7020 
AGCTATTATC TGTGCCGTGC GGCAGCGGAG CTCGGCATTC AAAAGCTCGT GTTCGCCAGC 7080 
AGCAATCATG TCACAGATGT ATATGAAAAA GACGGGCGCT CGCTCTTAGG ACGGGAAATC 7140 
ACAACAAGCG ATTATCCGCT GTCAAAAAAC TTGTACGGTG TATTAAAGCT GACCTCTGAA 7200 
CAGATCGGCC ATTTGTTTTA TTTGGAAAAT AAGCTATCAG TAATCAACCT TCGAATCGGA 7260 
ACAGTCGTGA CAGATGAAAT GGATACGCTG CATGAAAAAG AACGGACGAA AAAGACACTG 7320 
CTTTCTCACC CCGATCTGCT GTCGATTTTC AAAGCCGCCA TTGAGACCAA CATCCGGTAT 7380 
GGCACTTATT ACGCCGTCTC TGATAATCCG GGCCGGCCAT GGTCCATTGA ATCTGCCGTG 7440 
AATGAACTTG GGTTTTCGCC ACAAATCAAT ACGGCTGAAC TTCTGAACGA GGAGGAGAAC 7500 
GGAGCATAAT CATTTTCTAA GATTATGCTC TTTTTCTTTT GTTATCGGTC TCAATTCGCG 7560 
GCAGCCCCCG CCCGGCCGGG GACACTGTTC AAATGATTAT AGACATGGCA ATCACAGATT 7620 
TGCTACATTT TAGACACGAT ATCGTCACAT GCTGAGCTCG GTTTCCAAAA ATATGATAAC 7680 
GCTTACAAAG GGAGGTGGGA GCTATCGCAC ATTCACTGAA AAACCGTCTG TTTGATATG? 7740 
TGATTTATC3 TTTCTTGCTG ATGTTCGCTT TAATATGCGT ACTTCCGTTC ATTCATGTTA 7800 
TCGCAGCATC CTTTGCCACA GTAGAAGAAG TCGTG7CGAA AAAATTTATT TTAATACCGA 7860 
CCACTTTTTC GCTAGATGCT TATCGCTACA TTTTTTCAAC AGATATTATT TATAAGAGTT 7920 
TGCTTGTTTC 7GTGTTTGTG ACAG7GATAG GCACTGCGGT CAGCATGTTT CTTTCGTCAC 7980 
TGATGGCTTA CGGGTTATCC CGCCGTGATT TAATCGGCCG GCAGCCGCTC ATGTTTCTCG 8040 
TCGTATTTAC GATGCTGTTT AGCGGCGGCA TGATTCCGAC TTTCCTTGTG GTCAAATCGC 8100 
TTGGATTGCT CGATTCTTAC TGGGCGCTTA TTTTGCCGAC AGCCATTAAT GCCTTTAACC 8160 
TGATCATTCT GAAAAACTTC TTTCAAAATA TCCCGTCAAG CCTGGAAGAG TCCGCGAAAA 8220 
TTGACGGGTG CAATGATCTG GGCATATTCT TTAAAATTGT GCTGCCGCTG TCTCTTCCTG 8280 
CGATCGCAAC GATTTCACTA TTTTATGCGG TCACGTATTG GAACACGTAT ATGACAGCGA 8340 
TCTTGTACTT AAATGATTCA 3CAAAATGGC CAATTCAGGT GCTTCTGCGC CAAATCGTCA 8400 
TTGTATCAAG CGGTATGCAG GGGGATATGT CTGAAATGGG GTCGGGCAGC CCGCCGCCTG 8460 
AGCAAACCAT NNNNNTGG 8478 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 2: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 29 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

TTGACANNNN NNNNNNNNNN NNNTATATT 29 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 3: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 29 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

TTGACANNNN NNNNNNNNNN NNNTATAAT 29 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 4: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

TTGTAANNNN NNNNNNNNNN NNNNTAATAT 30 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 5: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 29 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

TTGATANNNN NNNNNNNNNN NNNAAAAGT 29 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 6: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 29 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

TTGAAANNNN NNNNNNNNNN NNNTCTTAT 29 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 7: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 300 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

AAGCTGTCCG GTTTTTGCAA AAGTGGCTGT GACTGTAAAA AGAAATCGAA AAAGACCGTT 60 
TTGTGTGAAA ACGGTCTTTT TGTTTCCTTT TAACCAACTG CCATAAATCG ATCCTTTCTT 120 
CTATTGACAG AAACAGGAGA GAATAATATA TTCTAATTGT TAACCTTTGA ATATAATTGG 180 
TTAACAATT? AGGTGAGAAG CGCTACACGT TCTTCAGTTA TCAGTGAAAG GGCGAGAAAT 240 
GATGCAAGAA GAAACTTTTT ATAGTGTCAG AATGAGGGCT TCAATGAATG GATCTCATGA 300 
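
SEQ ID NO: 2 through 6 each consist of two defined hexamers separated by a run of unspecified bases (N). As a minimal, hedged illustration of how such an entry can be compared with a longer sequence, the sketch below converts a listing entry into a regular expression and searches the third row (positions 121-180) of SEQ ID NO: 7. The helper names are hypothetical, and treating N as "any of A, C, G, T" is the usual convention assumed here.

```python
import re

SEQ_ID_2 = "TTGACANNNN NNNNNNNNNN NNNTATATT"
SEQ_ID_3 = "TTGACANNNN NNNNNNNNNN NNNTATAAT"

# Positions 121-180 of SEQ ID NO: 7 as printed in the listing.
SEQ_ID_7_ROW_3 = "CTATTGACAG AAACAGGAGA GAATAATATA TTCTAATTGT TAACCTTTGA ATATAATTGG"


def flatten(seq: str) -> str:
    """Remove the block spacing used in the listing and upper-case the bases."""
    return "".join(seq.split()).upper()


def listing_to_regex(entry: str) -> "re.Pattern":
    """Turn a listing entry into a regex in which each run of N matches any bases."""
    flat = flatten(entry)
    pattern = re.sub(r"N+", lambda m: "[ACGT]{%d}" % len(m.group()), flat)
    return re.compile(pattern)


def find_entry(entry: str, target: str):
    """Return (0-based start, matched text) of the first match, or None."""
    match = listing_to_regex(entry).search(flatten(target))
    return (match.start(), match.group()) if match else None


if __name__ == "__main__":
    print("SEQ ID NO: 2 in row 3 of SEQ ID NO: 7:", find_entry(SEQ_ID_2, SEQ_ID_7_ROW_3))
    print("SEQ ID NO: 3 in row 3 of SEQ ID NO: 7:", find_entry(SEQ_ID_3, SEQ_ID_7_ROW_3))
```

With the sequences as printed, the SEQ ID NO: 2 pattern occurs once in this row, while the SEQ ID NO: 3 pattern does not.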

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 8: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 36 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

AATGTGTTAA CTTAAAAACT ATAGTTGGTT AACTAA 36 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 9: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

CTAATTGTTA ACCTTTGAAT ATAATTGGTT AACAATTTAG 40 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 10: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 38 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

GGCCAAGCTT GTCGACCGAA ACAGCAGTTA TAAGGCAT 38 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 11: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 24 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GGCCCGTCTA GAGCTTCTCA CCTA 24 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 12: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 35 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GGCCGAGAAG CTCTAGACGT TCTTCAGTTA TCAGT 35 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 13: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 24 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

CGCCAGGGTT TTCCCAGTCA CGAC 24 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 14: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 28 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

TAGAAGAAAG GCTCGAGTTA TGGCAGTT 28 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 15: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 28 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

AACTGCCATA ACTCGAGCCT TTCTTCTA 28 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 16: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Ala Val Arg Phe Leu Gln Lys Trp Leu 
1               5 

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 17: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Met Met Gln Glu Glu Thr Phe Tyr Ser Val Arg Met Arg Ala Ser Met 
1               5                   10                  15 

Asn Gly Ser His Glu 
                20 
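
The two peptide entries can be cross-checked against the nucleotide listing. The sketch below, offered only as an illustration, translates two open reading frames with the standard genetic code: the stretch of SEQ ID NO: 7 beginning at its third base, and the stretch spanning positions 239-301 of the same region as printed in SEQ ID NO: 1 and 7. The choice of reading frames is an assumption made here by inspection of the printed sequences, and the function name is hypothetical. Output is in one-letter amino acid code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Reading frames copied from the listing; codon spacing added for readability.
ORF_SEQ_16 = "GCT GTC CGG TTT TTG CAA AAG TGG CTG TGA"
ORF_SEQ_17 = ("ATG ATG CAA GAA GAA ACT TTT TAT AGT GTC AGA ATG AGG GCT TCA "
              "ATG AAT GGA TCT CAT GAA")


def translate(dna: str) -> str:
    """Translate DNA (5' to 3', frame 1) into one-letter amino acids up to a stop codon."""
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the peptide
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print("SEQ ID NO: 16 frame:", translate(ORF_SEQ_16))
    print("SEQ ID NO: 17 frame:", translate(ORF_SEQ_17))
```

In one-letter code, SEQ ID NO: 16 reads AVRFLQKWL and SEQ ID NO: 17 reads MMQEETFYSVRMRASMNGSHE, which is what the two frames above translate to.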



is 

Claims 

1. A DNA comprising a DNA sequence that is selected from the group consisting of. 

(a) a DNA sequence of a gene that encodes a biotin biosynthetic enzyme of Bacillus subtilis, or or a 
?o species closely related to Bacillus subtilis; 

(b) a DNA sequence of a biologically active fragment of (a); and 

(c) a DNA sequence that is substantially homologous to (a) or (b). 

2. The DNA of claim 1, wherein said gene is selected from the group consisting of bioA, bioB, bioD, bioF, 
25 btoW. biol, and QRF2. 

3. The DNA of claim 1 , wherein said gene is bioA of B. subtilis, or a closely related species thereof 

4. The DNA of claim 1 wherein said gene is bioB of 5, subtilis, or a closely related species thereof. 

.30 

5. The DNA of claim 1, wherein said gene is the biol gene of B, subtilis, or a closely related species 
thereof 

6. The DNA of any of claims 1-5, wherein said DNA sequence is oporably linked to a transcriptional 
35 promoter. 

7. The DNA of any of claims 1-6, wherein said DNA comprises at least two of said DNA sequences. 

8. The DNA of claim 7, wherein a first one of said DNA sequences is operably linked to a first
transcriptional promoter, and a second one of said DNA sequences is operably linked to a second
transcriptional promoter.

9. The DNA of claim 8, wherein at least one of said promoters is a constitutive promoter. 

10. The DNA of claim 9, wherein said constitutive promoter is derived from the SP01 bacteriophage.

11. The DNA of claim 8, wherein said first one of said DNA sequences is the one of a gene selected from
the group consisting of bioA, bioB, bioD, bioF, and bioW of B. subtilis, or a closely related species
thereof; and said second DNA sequence is the one of bioI of B. subtilis, or a closely related species
thereof.

12. The DNA of claim 8 or of claim 11, wherein at least the following DNA sequence(s) is operably linked to
said second promoter: said bioA, said bioB, or both said bioA and said bioB. 

13. The DNA of claim 6, wherein said DNA comprises the DNA sequence of the biotin operon of Bacillus
subtilis, or of a closely related species thereof, operably linked to said transcriptional promoter.

14. The DNA of any of claims 1-13 comprising a regulatory site of a biotin operon of B. subtilis or a closely
related species, said regulatory site being mutated with respect to wild-type DNA and being selected
from the group consisting of an operator, a promoter, a site of transcription termination, a site of mRNA
processing, a ribosome binding site, and a site of catabolite repression, said mutation being either an
insertion, a substitution, or a deletion.

15. A vector comprising a DNA as claimed in any one of claims 1-14. 

16. A cell comprising the vector of claim 15 or a DNA as claimed in any one of claims 1-14.

17. The cell of claim 16, wherein said DNA is amplified to multiple copies in said cell.

18. The cell of claim 16, wherein said DNA is stably integrated into the chromosome of said cell.

19. The cell of claim 18, wherein said DNA is amplified to multiple copies in said chromosome of said cell.

20. The cell of claim 18 or claim 19, wherein said DNA is integrated at the bio locus of said chromosome. 

21. The cell of any of claims 18-20, wherein said DNA is integrated at more than one site in said
chromosome.

22. The cell of any one of claims 16-21, wherein said cell is characterized by a mutation that deregulates
production of biotin or a biotin precursor, in addition to the presence of said DNA.

23. The cell of any one of claims 16-22, wherein said cell produces an increase in biotin in comparison to
wild-type cells lacking said DNA.

24. The cell of claim 22, wherein said cell contains a mutation that confers resistance to azelaic acid.

25. The cell of claim 22, wherein said mutant cell is mutated in birA.

26. The cell of any one of claims 16-25, wherein said cell is Bacillus subtilis, or a closely related species
thereof.

27. The cell of any one of claims 16-25, wherein said cell is Escherichia coli.

28. A recombinant protein encoded by a DNA as defined in any one of claims 1-5. 

29. A process for the production of biotin or a precursor thereof, said process comprising the steps of:

(a) providing the cell of any one of claims 16-27;

(b) culturing said cell for a time and under conditions which allow synthesis of biotin or said 
precursor, and 

(c) isolating said biotin or said precursor. 

30. The process of claim 29, wherein said biotin or said precursor is secreted from said host cell and
isolated from the extracellular media of said host cell.
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FIG. 1 
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FIG. 3 
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FIG. 14-1 
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FIG. 14-2 
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FIG. 14-3 

2551 TTGATTCCTG GTTAAACGAG CGGTTAGACA GAATGAAAGA AGCCCCCGTA 
2601 CATCGTAACC TGCGGTCAAT GGATGGAGCG CCGGTTCCAG AGAGGAATAT
2651 TGATGGCGAA AATCAAACGG TCTCGTCCTC AAACAATTAT TTAGGGCTCG
2701 CAAGCCATAG ACGTTTGATC GATGCAGCCC AAACAGCATT GCAGCAATTT 
2751 GGGACAGGAA GCAGCGGTTC ACGTTTAACG ACAGGCAATT CGGTCTGCCA 
2801 TGAAAAGCTA GAAAAGAAGA TTGCCAGCTT TAAACTGACA GAAGCGGCCC
2851 TGCTGTTTTC GAGCGGTTAC TTGGCCAATG TCGGTGTCCT TTCATCCTTG
2901 CCAGAAAAGG AAGATGTCAT TTTAAGTGAC CAGCTCAATC ATGCAAGTAT
2951 GATCGACGGC TGCCGACTTT CTAAGGCTGA TACAGTTGTT TATCCGCATA 
3001 TTGATATGAA TGATCTTGAA AACAAGCTGA ATGAAACACA GCGTTATCAG 
3051 CGCCGTTTTA TCGTAACAGA CGGAGTATTC AGCATGGATG GCACAATCCC 
3101 CCCTCTTGAT CAGATCATCT CACTTGCGAA ACGCTATCAT GCCTTCGTGG 
3151 TCGTTGATGA TGCCCACGCA ACAGGAGTTT TGGGCGATTC GGGACAAGGA 
3201 ACGAGTGAAT ACTTTGGTGT TTGTCCCGAC ATTGTTATCG GCACCTTAAG 
3251 CAAAGCTGTT GGCGCGGAAG GAGGTTTTGC GGCAGGATCA GCGGTCTTCA 

3301 TCGACTTTTT GCTGAACCAT GCCAGAACAT TTATCTTTCA AACCGCTATT
3351 CCGCCAGCCA GCTGTGCGGC TGCTCACGAG GCTTTCAACA TCATTGAAGC
3401 CAGCAGGGAA AAACGACAGC TTTTATTTTC TTATATCAGC ATGATCAGAA
3451 CCAGTCTGAA GAATATGGGT TATGTGGTGA AAGGAGATCA CACACCGATT
3501 ATTCCTGTAG TCATTGGCGA TGCCCATAAA ACGGTCCTAT TTGCTGAAAA
3551 ACTGCAGGGC AAGGGAATTT ATGCTCCTGC CATTCGGCCG CCAACCGTTG
3601 CGCCGGGTGA AAGCCGGATT CGAATTACAA TCACGTCTGA CCACAGTATG
3651 GGTGATATTG ATCATTTGCT GCAAACATTT CATTCAATCG GAAAGGAGCT
3701 GCACATCATT TGAGCGGTTT TTTTGTGACG GGAACTGATA CAGAAGTAGG 
3751 GAAAACGGTT ATATCCAGCG GTCTTGCTGC CTTATTGAAA GACAATAATA 
3801 GACATGTCGG GGTGTATAAA CCATTTTTAA GCGGGATATC GCGCCATCAT 
3851 CCAGATAGTG ATACAAGTTT GCTGAAAGAT ATGTCGCAGA CCAGTCTTTC
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FIG. 14-4 
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FIG. 14-5 



5251 ACATTGCAGC CATTAGGGCT TTACGCCGCA AACTCCATTT TTGTCCGAGA
5301 CTACTTAACA ACTGCCGGGC AAGAGGAGAC GGAGGATCAT AAAATGCTGA
5351 GTGATTTAGG CTTTCAAGTT GAATCAGTCG AAGAAATGAA GGCTAGTTTA
5401 AGTGCGAAAA GCTGAAAGAA TCAATAAAAG CAATCGGTAT GATGTCGATT
5451 GTTTTTATTT TTGAACAGAA AGGAGAAAAT CACGTGACAA TTGCATCGTC
5501 AACTGCATCT TCTGAGTTTT TGAAAAACCC ATATTCTTTT TACGACACAT
5551 TGCGAGCTGT TCATCCTATC TATAAAGGGA GTTTCTTAAA ATACCCGGCC 
5601 TGGTATGTCA CAGGATATGA AGAAACGGCT GCTATTTTGA AAGATGCGAC 
5651 ATTCAAAGTC CGCACCCCGC TGCCTGAGAG CTCAACCAAA TATCAGGACC
5701 TTTCACATGT GCAAAATCAA ATGATGCTGT TTCACAACCA GCCTGATCAT 
5751 AGACGATTGC GGACGCTTGC CAGCGGAGCG TTTACGCCGA GAACGAGAGA
5801 GAGTTATCAG CCGTATATCA TTGAAACTGT CCATCATTTG CTTGATCAAG 
5851 TGCAAGGTAA AAAAAAGATG GAGGTCATTT CGGACTTTGC TTTTCCTTTA 
5901 GCAAGTTTTG TCATAGCTAA CATTATAGGT GTACCGGAGG AAGATAGGGA 
5951 GCAATTAAAG GAGTGGGCTG CGAGTCTCAT TCAAACGATT GATTTTACCC
6001 GCTCAAGAAA GGCATTAACA GAGGGCAATA TTATGGCTGT GCAGGCTATG 
6051 GCATATTTCA AAGAGCTGAT TCAAAAGAGA AAACGCCACC CTCAACAGGA 
6101 TATGATCAGC ATGCTCTTGA AGGGGAGAGA AAAGGATAAG CTGACGGAAG 
6151 AGGAGGCGGC ATCTACCTGC ATATTGCTGG CGATCGCCGG ACATGAGACA 
6201 ACGGTCAATC TCATCAGCAA TTCAGTCCTT TGTCTGCTGC AGCATCCAGA 
6251 ACAGCTTTTG AAACTGAGAG AAAATCCAGA TCTTATTGGT ACCGCAGTCG 
6301 AGGAATGTTT ACGCTATGAA AGCCCCACGC AAATGACAGC CAGAGTTGCG
6351 TCAGAGGATA TTGACATCTG CGGGGTGACG ATCCGTCAAG GAGAACAAGT
6401 CTATCTTTTG TTAGGAGCGG CTAATCGAGA CCCTAGCATA TTCACGAACC
6451 CCGATGTCTT CGATATTACG AGAAGTCCTA ATCCGCATCT TTCATTCGGG
6501 CATGGCCATC ATGTTTGCTT AGGGTCCTCC CTGGCACGAT TAGAAGCGCA 
6551 AATTGCGATT AACACTCTTC TGCAGCGAAT GCCCAGCCTT AATCTTGCGG 
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FIG. 14-6 
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FIG. 14-7 



7951 GCACTGCGGT CAGCATGTTT CTTTCCTCAC TGATCGCTTA CGGGTTATCC
8001 CGCCGTGATT TAATCGGCCG GCAGCCGCTC ATGTTTCTCG TCGTATTTAC
8051 GATGCTCTTT AGCGGCGGCA TGATTCCGAC TTTCCTTGTG GTCAAATCGC
8101 TTGGATTGCT CGATTCTTAC TGGGCGCTTA TTTTGCCGAC AGCCATTAAT
8151 GCCTTTAACC TGATCATTCT GAAAAACTTC TTTCAAAATA TCCCCTCAAG
8201 CCTGGAAGAG TCCGCGAAAA TTGACGGGTG CAATGATCTG GGCATATTCT
8251 TTAAAATTGT GCTCCCGCTG TCTCTTCCTG CGATCGCAAC GATTTCACTA
8301 TTTTATGCGG TCACGTATTG GAACACGTAT ATGACAGCGA TCTTGTACTT
8351 AAATGATTCA GCXAAATGGC CAATTCAGGT GCTTCTGCGC CAAATCCTCA
8401 TTGTATCAAG CGGTATGCAG GGGGATATGT CTGAAATGGG GTCGGGCAGC
8451 CCGCCGCCTG ACCAAACCAT KNNNNTGG
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FIG. 16 

A. Generation of pBIO178 by PCR
B. Ligation of pBIO178 and pBIO152 to generate SP01-15-bioI&bioW)
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FIG. 18 



Azelaic acid resistance of PA3 and PAS,
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